Fast diagnosis and personalized treatments for acne

ABSTRACT

Methods of diagnosing and treating patients afflicted with acne, including diagnosing one as having acne if the individual possesses RT4, RT5, RT7, RT8, RT9, or RT10. Methods for treating acne include administering an effective amount of a drug specifically targeting RT4, RT5, RT7, RT8, RT9, or RT10, such as small molecules, antisense molecules, siRNAs, biologics, antibodies, phages, vaccines, or combination thereof.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation application of U.S. patent application Ser. No. 15/257,423, filed Sep. 6, 2016, which is a divisional application of U.S. National Stage application Ser. No. 14/385,576, filed Sep. 16, 2014, which claims priority to International Application No. PCT/US2013/032551, filed on Mar. 15, 2013, which claims priority to U.S. Provisional Patent Application No. 61/612,290, filed on Mar. 17, 2012, each of which is incorporated by reference herein in its entirety for all purposes.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

This invention was made with government support under Grant Numbers AR057503 and GM099530, awarded by the National Institutes of Health. The government has certain rights in the invention.

SEQUENCE LISTING

The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy was submitted on Sep. 6, 2016 and is 3348.71 kilobytes in size.

BACKGROUND OF THE INVENTION

Acne is a skin condition that causes pimples or “zits.” This includes whiteheads, blackheads, and red, inflamed patches of skin (such as cysts). Acne occurs when tiny pores on the surface of the skin become clogged. Each pore opens to a follicle. A follicle contains a hair and an oil gland. The oil released by the gland helps remove old skin cells and keeps your skin soft. When glands produce too much oil, the pores can become blocked. Dirt, bacteria, and cells build up. The blockage is called a plug or comedone. If the top of the plug is white, it is called a whitehead. If the top of the plug is dark, it is called a blackhead. If the plug breaks open, swelling and red bumps occur. Acne that is deep in your skin can cause hard, painful cysts. This is called cystic acne.

Acne is most common in teenagers, but anyone can get acne. 85% of teenagers have acne. Hormonal changes may cause the skin to be more oily. Acne tends to run in families. It may be triggered by hormonal changes related to puberty, menstrual periods, pregnancy, birth control pills, or stress; greasy or oily cosmetic and hair products; certain drugs (such as steroids, testosterone, estrogen, and phenytoin); or high levels of humidity and sweating.

Various treatments exist for acne. In general, acne treatments work by reducing oil production, speeding up skin cell turnover, fighting bacterial infection, reducing inflammation, or doing all four. These types of acne treatments include over-the-counter topical treatments, antibiotics, oral contraceptives and cosmetic procedures. Acne lotions may dry up the oil, kill bacteria and promote sloughing of dead skin cells. Over-the-counter (OTC) lotions are generally mild and contain benzoyl peroxide, sulfur, resorcinol, or salicylic acid as their active ingredient. Studies have found that using topical benzoyl peroxide along with oral antibiotics may reduce the risk of developing antibiotic resistance. Antibiotics may cause side effects, such as an upset stomach, dizziness or skin discoloration. These drugs also increase your skin's sun sensitivity and may reduce the effectiveness of oral contraceptives. For deep cysts, antibiotics may not be enough. Isotretinoin (Amnesteem, Claravis, Sotret) is a powerful medication available for scarring cystic acne or acne that doesn't respond to other treatments. However, isotretinoin has many side effects, such as dry skin, depression, severe stomach pain, and muscle/joint/back pain, and can cause birth defects in babies whose mothers use isotretinoin. Oral contraceptives, including a combination of norgestimate and ethinyl estradiol (Ortho Tri-Cyclen, Previfem, others), can improve acne in women. However, oral contraceptives may cause other side effects, such as headaches, breast tenderness, nausea, and depression. Chemical peels and microdermabrasion may be helpful in controlling acne. These cosmetic procedures, which have traditionally been used to lessen the appearance of fine lines, sun damage, and minor facial scars, are most effective when used in combination with other acne treatments. They may cause temporary, severe redness, scaling and blistering, and long-term discoloration of the skin.

In addition to the negative side-effects caused by the currently available treatments, there is no treatment available that is personalized to patients to target specific bacteria causing acne on an individual level. Additionally, it will be useful for dermatologists to know which strains are dominant on the skin of a patient at the time of diagnosis in order to personalize acne treatments. Thus, there exists a need in the art for methods of personalized diagnoses and treatment of acne.

BRIEF SUMMARY OF THE INVENTION

The present invention is directed to methods of diagnosis and personalized treatment in patients afflicted with acne.

In one embodiment, the invention provides a method for determining whether an individual possesses acne comprising: obtaining a skin sample from an individual; isolating bacterial DNA from said sample; amplifying 16S ribosomal DNA in said sample; sequencing said amplified DNA products; and typing the individual's DNA based on one or more of the ten major ribotypes (RTs) of P. acnes strains, RT1-RT10 (SEQ ID NOs 1-10), wherein said typing occurs by determining whether said individual possesses one or more of RT1-RT10 and wherein said individual is diagnosed as having acne if said individual possesses RT4, RT5, RT7, RT8, RT9, or RT10. For example, said individual may be diagnosed as having acne if said individual possesses RT4 (SEQ ID NO:4), RT5 (SEQ ID NO:5), or RT8 (SEQ ID NO:8).
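As a minimal illustration of the final decision step only, the rule stated above can be sketched in Python. The function name and the string labels used for the ribotypes are assumptions made for this sketch; it presumes that ribotypes have already been assigned from the 16S rDNA sequences by an upstream typing step that is not shown.

```python
# Ribotypes whose presence leads to a diagnosis of acne in the method above.
ACNE_ASSOCIATED_RTS = {"RT4", "RT5", "RT7", "RT8", "RT9", "RT10"}


def diagnose_from_ribotypes(detected):
    """Apply the decision rule to an iterable of detected ribotype labels.

    `detected` is assumed to hold labels such as "RT1" ... "RT10" produced by
    a separate typing step (hypothetical here).
    """
    hits = sorted(set(detected) & ACNE_ASSOCIATED_RTS, key=lambda rt: int(rt[2:]))
    return {"acne_indicated": bool(hits), "acne_associated_ribotypes": hits}


result = diagnose_from_ribotypes(["RT1", "RT2", "RT5"])
print(result)
```

A sample typed only to ribotypes outside the listed set (for example RT1 and RT6) would return `acne_indicated` as `False` under this rule.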

In another embodiment, the invention provides a method for diagnosing different types of acne comprising: obtaining a skin sample from a subject; isolating bacterial DNA from said sample; amplifying 16S ribosomal DNA in said sample; sequencing said amplified DNA products; and typing the subject's DNA based on one or more of the five major microbiome types of P. acnes strains, wherein said subject is diagnosed as having acne if said subject is typed to microbiome IV or V.

In yet another embodiment, the invention provides a method for rapidly diagnosing acne comprising: obtaining a skin sample from a subject; isolating bacterial DNA from said sample; using one or more primer sets to amplify said DNA; and analyzing said amplified DNA for the presence of a sequence having at least 95% homology with at least one of SEQ ID NOs 29-32 and 82-434, wherein said subject is diagnosed as having acne if the presence of a sequence having at least 95% homology with at least one of SEQ ID NOs 29-32 and 82-434 exists. For example, said amplified DNA may be analyzed for the presence of a sequence having at least 99% homology with at least one of SEQ ID NOs 29-32 and 82-434 and wherein said subject is diagnosed as having acne if the presence of a sequence having at least 99% homology with at least one of SEQ ID NOs 29-32 and 82-434 exists. As another example, said amplified DNA may be analyzed for the presence of at least one of SEQ ID NOs 29-32 and 82-434 and wherein said subject is diagnosed as having acne if the presence of at least one of SEQ ID NOs 29-32 and 82-434 exists.

In another embodiment, the invention provides a method for rapidly diagnosing acne comprising: obtaining a skin sample from a subject; isolating bacterial DNA from said sample; using one or more primer sets to amplify said DNA; using one or more probes to detect said amplified DNA; and analyzing said probe signals for the presence of Locus 1 (at least one sequence having at least 95% homology to at least one of SEQ ID NOs 29 and 82-97), Locus 2 (at least one sequence having at least 95% homology to at least one of SEQ ID NOs 30 and 98-186), Locus 3 (at least one sequence having at least 95% homology to at least one of SEQ ID NOs 31 and 187-423), and/or Locus 4 (at least one sequence having at least 95% homology to at least one of SEQ ID NOs 32 and 424-434), wherein said subject is diagnosed as having acne if one or more of Loci 1-4 are present. For example, the signals may be analyzed for the presence of Locus 1, Locus 2, Locus 3, and/or Locus 4 based upon at least 99% homology or 100% homology.
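The threshold test on Loci 1-4 can be illustrated with a short Python sketch. Several simplifications are assumed here and are not taken from the source: "homology" is approximated as ungapped percent identity between sequences that are already aligned and of equal length, the reference sequences are short placeholders rather than the actual SEQ ID NOs, and the function and variable names are hypothetical. A real workflow would use a proper sequence alignment tool.

```python
def percent_identity(a, b):
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def loci_present(amplicon, references, threshold=95.0):
    """Return the loci with at least one reference meeting the identity threshold."""
    present = []
    for locus, refs in references.items():
        for ref in refs:
            if len(ref) == len(amplicon) and percent_identity(amplicon, ref) >= threshold:
                present.append(locus)
                break
    return present


# Placeholder 20-nt references for illustration only, NOT the actual SEQ ID NOs.
REFERENCES = {
    "Locus 1": ["ACGTACGTACGTACGTACGT"],
    "Locus 2": ["TTGACCGGTAATCCGGATAC"],
}

amplicon = "ACGTACGTACGTACGTACGA"  # one mismatch against the Locus 1 placeholder
detected = loci_present(amplicon, REFERENCES)
acne_indicated = bool(detected)  # diagnosed if one or more loci are present
print(detected, acne_indicated)
```

Raising `threshold` to 99.0 or 100.0 corresponds to the stricter variants mentioned above.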

In the foregoing methods, a primer of said primer sets may be selected from the group consisting of SEQ ID NOs 11, 12, 17, and 18 (for Locus 1), SEQ ID NOs 13, 14, 20, and 21 (for Locus 2), SEQ ID NOs 15, 16, 23, and 24 (for Locus 3), and SEQ ID NOs 26 and 27 (for Locus 4). In the foregoing methods, said probes may be SEQ ID NO:19 (for Locus 1), SEQ ID NO:22 (for Locus 2), SEQ ID NO:25 (for Locus 3), and SEQ ID NO:28 (for Locus 4).

In yet another embodiment, the invention provides a vaccine for the prevention and/or treatment of acne caused by P. acnes comprising a heat inactivated P. acnes strain, an attenuated protein of said strain, or combination thereof, wherein said strain is an RT4 strain, an RT5 strain, an RT7 strain, an RT8 strain, an RT9 strain, or an RT10 strain.

In yet another embodiment, the invention provides a vaccine for the prevention and/or treatment of acne caused by P. acnes comprising a heat inactivated P. acnes strain, an attenuated protein of said strain, or combination thereof identified to be specific to a subject based on 16S rDNA sequence analysis of the strains of P. acnes affecting said subject.

With regard to the vaccines, said heat inactivated P. acnes strain, attenuated protein, or combination thereof may be specific for at least one of unique genomic loci, regions, or sequences identified for the strains of P. acnes. Said heat inactivated P. acnes strain, attenuated protein, or combination thereof may be specific for at least one of Locus 1 (SEQ ID NOs 29 and 82-97), Locus 2 (SEQ ID NOs 30 and 98-186), Locus 3 (SEQ ID NOs 31 and 187-423), and Locus 4 (SEQ ID NOs 32 and 424-434).

In yet another embodiment, the invention provides a method for the personalized treatment of acne comprising determining the strains of P. acnes affecting a subject and treating said subject with an active ingredient directed to at least one detected strain of P. acnes, wherein the active ingredient comprises a drug targeting specific strains of P. acnes, wherein the targeting drug comprises small molecules, antisense molecules, siRNA, biologics, antibodies, and combinations thereof targeting genomic elements specific for strains of P. acnes associated with acne.

In yet another embodiment, the invention provides a method for treating acne comprising: administering an effective amount of a probiotic that comprises at least one strain of P. acnes that is associated with healthy or normal skin based on its 16S rDNA. Said strain may be an RT6 strain. Said strain may have at least 95% homology to SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, or SEQ ID NO:54, such as at least 99% homology or 100% homology.

In yet another embodiment, the invention provides a method for treating acne comprising: administering an effective amount of a metabolite produced by a strain of P. acnes that is associated with healthy or normal skin, wherein said metabolite is selected from the group comprising bacterial culture supernatant, cell lysate, proteins, nucleic acids, lipids, and other bacterial molecules. Said strain may be an RT6 strain. Said strain may have at least 95% homology to SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, or SEQ ID NO:54, such as at least 99% homology or 100% homology.

In yet another embodiment, the invention provides a method for treating acne in a subject comprising: administering an effective amount of a drug specifically targeting RT4, RT5, RT7, RT8, RT9, or RT10, when said subject is determined to possess RT4, RT5, RT7, RT8, RT9, or RT10, respectively. The earlier-described methods may be performed prior to administration of said drug. Said drug may be a small molecule, antisense molecule, siRNA, biologic, antibody, or combination thereof.

In yet another embodiment, the invention provides a composition comprising at least one strain of P. acnes that is associated with healthy or normal skin. Said strain may be an RT6 strain. Said strain may have at least 95% homology to SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, or SEQ ID NO:54, such as at least 99% homology or 100% homology.

In yet another embodiment, the invention provides a method for diagnosing IB-3-based acne comprising: obtaining a skin sample from a subject; isolating bacterial DNA from said sample; using one or more primer sets to amplify said DNA; and analyzing said amplified DNA for the presence of a sequence having at least 95% homology with at least one of SEQ ID NOs 55-81, wherein said subject is diagnosed as having IB-3-based acne if the presence of a sequence having at least 95% homology with at least one of SEQ ID NOs 55-81 exists.

In yet another embodiment, the invention provides a method for the personalized treatment of acne comprising determining the strain(s) of acne affecting a subject and administering to said subject an effective amount of at least one phage specifically directed to said strain(s). For example, the subject may be treated with phage directed against an RT4 strain, an RT5 strain, an RT7 strain, an RT8 strain, an RT9 strain, and/or an RT10 strain.

In yet another embodiment, the invention provides a method for treating an individual suffering from acne of microbiome type I comprising administering to said individual an effective amount of a phage, wherein said phage is selected from the group consisting of: PHL113M01 (SEQ ID NO:36), PHL111M01 (SEQ ID NO:33), PHL082M00 (SEQ ID NO:47), PHL060L00 (SEQ ID NO:34), PHL067M10 (SEQ ID NO:42), PHL071N05 (SEQ ID NO:41), PHL112N00 (SEQ ID NO:35), PHL037M02 (SEQ ID NO:40), PHL085N00 (SEQ ID NO:46), PHL115M02 (SEQ ID NO:43), PHL085M01 (SEQ ID NO:44), PHL114L00 (SEQ ID NO:37), PHL010M04 (SEQ ID NO:38), and PHL066M04 (SEQ ID NO:39).

In yet another embodiment, the invention provides a method for treating an individual suffering from acne of microbiome type I with IB-3 strain comprising administering to said individual an effective amount of a phage, wherein said phage is selected from the group consisting of: PHL082M00 (SEQ ID NO:47) and PHL071N05 (SEQ ID NO:41).

In yet another embodiment, the invention provides a method for treating an individual suffering from acne of microbiome type II comprising administering to said individual an effective amount of a phage, wherein said phage is selected from the group consisting of: PHL113M01 (SEQ ID NO:36), PHL060L00 (SEQ ID NO:34), PHL112N00 (SEQ ID NO:35), and PHL085M01 (SEQ ID NO:44).

In yet another embodiment, the invention provides a method for treating an individual suffering from acne of microbiome type III or dominant RT8 comprising administering to said individual an effective amount of a phage, wherein said phage is selected from the group consisting of: PHL113M01 (SEQ ID NO:36), PHL111M01 (SEQ ID NO:33), PHL082M00 (SEQ ID NO:47), PHL060L00 (SEQ ID NO:34), PHL067M10 (SEQ ID NO:42), PHL071N05 (SEQ ID NO:41), PHL112N00 (SEQ ID NO:35), PHL037M02 (SEQ ID NO:45), PHL085N00 (SEQ ID NO:46), PHL115M02 (SEQ ID NO:43), PHL085M01 (SEQ ID NO:44), PHL114L00 (SEQ ID NO:37), PHL073M02 (SEQ ID NO:40), PHL010M04 (SEQ ID NO:38), and PHL066M04 (SEQ ID NO:39).

In yet another embodiment, the invention provides a method for treating an individual suffering from acne of microbiome type IV comprising administering to said individual an effective amount of a phage, wherein said phage is selected from the group consisting of: PHL113M01 (SEQ ID NO:36), PHL111M01 (SEQ ID NO:33), PHL082M00 (SEQ ID NO:47), PHL060L00 (SEQ ID NO:34), PHL067M10 (SEQ ID NO:42), PHL071N05 (SEQ ID NO:41), PHL112N00 (SEQ ID NO:35), PHL037M02 (SEQ ID NO:45), PHL085N00 (SEQ ID NO:46), PHL115M02 (SEQ ID NO:43), PHL085M01 (SEQ ID NO:44), PHL114L00 (SEQ ID NO:37), PHL073M02 (SEQ ID NO:40), PHL010M04 (SEQ ID NO:38), and PHL066M04 (SEQ ID NO:39).

In yet another embodiment, the invention provides a method for treating an individual suffering from acne of microbiome type V comprising administering to said individual an effective amount of a phage, wherein said phage is selected from the group consisting of: PHL113M01 (SEQ ID NO:36), PHL111M01 (SEQ ID NO:33), PHL082M00 (SEQ ID NO:47), PHL060L00 (SEQ ID NO:34), PHL067M10 (SEQ ID NO:42), PHL071N05 (SEQ ID NO:41), PHL112N00 (SEQ ID NO:35), PHL037M02 (SEQ ID NO:45), PHL085N00 (SEQ ID NO:46), PHL115M02 (SEQ ID NO:43), PHL085M01 (SEQ ID NO:44), PHL114L00 (SEQ ID NO:37), PHL073M02 (SEQ ID NO:40), PHL010M04 (SEQ ID NO:38), and PHL066M04 (SEQ ID NO:39).

In yet another embodiment, the invention provides a method for treating a Propionibacterium humerusii-associated malady comprising administering to said individual an effective amount of a phage, wherein said phage is selected from the group consisting of: PHL113M01 (SEQ ID NO:36), PHL111M01 (SEQ ID NO:33), PHL082M00 (SEQ ID NO:47), PHL067M10 (SEQ ID NO:42), PHL071N05 (SEQ ID NO:41), PHL085N00 (SEQ ID NO:46), PHL085M01 (SEQ ID NO:44), PHL114L00 (SEQ ID NO:37), PHL073M02 (SEQ ID NO:40), and PHL010M04 (SEQ ID NO:38).

In yet another embodiment, the invention provides a kit for diagnosing acne in a subject, wherein said kit comprises: at least one primer selected from the group comprising SEQ ID NOs 11-18, 20, 21, 23, 24, 26, and 27; and instructions for use.

In yet another embodiment, the invention provides a kit for diagnosing acne in a subject, wherein said kit comprises: at least one primer selected from the group comprising SEQ ID NOs 11-18, 20, 21, 23, 24, 26, and 27; at least one probe selected from the group comprising SEQ ID NOs 19, 22, 25, and 28; and instructions for use.

BRIEF DESCRIPTION OF THE FIGURES

This application file contains at least one drawing executed in color. Copies of this application with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.

FIG. 1 shows that P. acnes dominates the microbiota of pilosebaceous units, accounting for 87% of the clones. P. acnes was dominant in pilosebaceous units in both acne patients and individuals with normal skin. By 16S rDNA sequencing, P. acnes sequences accounted for 87% of all the clones. Species with a relative abundance greater than 0.35% are listed in order of relative abundance. Species distribution from a metagenomic shotgun sequencing of pooled samples from normal individuals confirmed the high abundance of P. acnes in pilosebaceous units, as shown on the far right column.

FIG. 2 shows that the rank abundance of P. acnes ribotypes shows a distribution similar to that seen at the higher taxonomic levels. A few highly-abundant ribotypes and a large number of rare ribotypes were observed in the samples. Some ribotypes were highly enriched in acne patients. Only the top 30 most abundant ribotypes are reflected in FIG. 2.

FIG. 3 shows that the most abundant P. acnes ribotypes in pilosebaceous units were also abundant at other body sites. The major ribotypes found in acne patients and normal individuals were compared to the datasets from the HMP and Grice et al. (2009). The top three ribotypes are the most abundant ones in different datasets. The excess RT4 and RT5 seen in the dataset by Grice et al. (2009) was due to one subject, HV4, whose P. acnes strain population was dominated by these two ribotypes at every skin site sampled. After removal of this subject, the ribotype distribution is similar to the HMP samples and the normal skin samples studied. RT6 was also found to be abundant in the HMP dataset, whose samples were collected from healthy individuals.

FIG. 4 shows that P. acnes population structures differ in acne and normal skin. P. acnes populations from samples were clustered using principal coordinates analysis of the weighted UniFrac distance matrix for the top ten most abundant ribotypes. The principal coordinate 1 (P1) explains 43.64% of the variation and P2 explains 20.07% of the variation. The analysis was performed using QIIME (Caporaso et al. 2010).

FIG. 5 shows the distribution of the top ten most abundant P. acnes ribotypes in acne patients and individuals with normal skin. Each column represents the percentage of the top ten ribotypes identified in each subject. The average P. acnes clone number per subject was 262 and the average clone number of top ten ribotypes was 100. Five major microbiome types at the P. acnes strain level were observed in the data. Types IV and V were mostly found in acne patients. Two samples (one from acne, one from normal skin) with fewer than 50 P. acnes 16S rDNA sequences are not displayed.

FIG. 6 shows the distribution of the top ten most abundant P. acnes ribotypes in all samples without separating the two groups of acne and normal skin. Each column represents the percentage of the top ten ribotypes identified in each sample. When all samples were clustered, the same five major microbiome types at the P. acnes strain level were observed, indicating that microbiome classification does not depend on the states of the disease. Only three out of 99 samples were clustered differently compared to the one shown in FIG. 5 (marked with asterisks). Two samples, one from acne and one from normal skin, with fewer than 50 P. acnes 16S rDNA sequences are not shown.

FIG. 7 shows that the same five major microbiome types were observed in multiple datasets. Samples from the study, HMP, and Grice et al. (2009) were clustered together based on the top ten most abundant P. acnes ribotypes. In total, 284 samples were included. Each column represents the percentage of the top ten ribotypes identified in each sample. Both HMP samples and samples from Grice et al. (2009) were collected from healthy individuals; therefore, microbiome types IV and V are under-represented in the analysis. Samples with fewer than ten sequences of the top ten ribotypes were not included.

FIG. 8 indicates that the genome comparison of 71 P. acnes strains showed that the genomes of RT4 and RT5 are distinct from others. Two chromosomal regions, loci 1 and 2, are unique to clade IA-2 and one other genome HL086PA1. Clade IA-2 consists of mainly RT4 and RT5 that were highly enriched in acne. The presence of a plasmid (locus 3) is also characteristic of RT4 and RT5. Each row represents a P. acnes genome colored according to the ribotypes. Rows are ordered by the phylogeny calculated based on the SNPs in the P. acnes core genome. Only the topology is shown. The clades were named based on their recA types (IA, IB and II). Columns represent predicted open reading frames (ORFs) in the genomes and are ordered by ORF positions along the finished genome HL096PA1, which encodes a 55 Kb plasmid. Only the first 300 ORFs on the chromosome (on the left) and all the ORFs on the plasmid (on the right) are shown. The colored plasmid regions represent genes on contigs that match exclusively to the HL096PA1 plasmid region. The genes that fall on contigs that clearly extend beyond the plasmid region are likely to be chromosomally located and are colored in grey. Acne index for the ribotypes was calculated based on the percentage of clones of each ribotype found in acne as shown in column 5 in Table 1.

FIG. 9 shows the phylogenetic tree constructed based on the 96,887 SNPs in P. acnes core genome, which shows that the 71 genomes cluster into distinct clades, consistent with recA types that have been used to classify P. acnes strains. The 16S ribotypes of the genomes represent the relationship of the lineages to a large extent. At one end of the tree, clades IA-2 and IB-1 mainly consist of the ribotypes enriched in acne, and at the other end of the tree, RT6 in clade II was mainly found in healthy subjects. A bootstrap test with 1,000 replicates was performed. The distances between the branches were calculated based on the SNPs in the core genome and do not represent the non-core regions of each genome. The enlarged branches were colored according to the 16S ribotypes as shown in FIG. 8.

FIG. 10 provides a genome comparison of 71 P. acnes strains and shows that the genomes of RT4 and RT5 are distinct from others. All of the predicted open reading frames (ORFs) encoded on the chromosome are shown. Each row represents a P. acnes genome colored according to the ribotypes. Rows are ordered by the phylogeny calculated based on the SNPs in P. acnes core genome. Only the topology is shown. Columns represent ORFs in the genomes and are ordered by their positions along the finished genome HL096PA1. Loci 1 and 2, which are unique to mainly RT4 and RT5 strains, and locus 4, which is unique to mainly RT8 strains, can be seen in the figure.

FIG. 11 provides a sequence coverage comparison between the chromosome and the plasmid region in all genomes harboring a putative plasmid, which shows that the copy number of plasmid ranges from 1 to 3 per genome. The X-axis represents the DNA sequences along the chromosome based on the coordinates of the finished genome HL096PA1, followed by plasmid sequences. The Y-axis represents the sequence coverage. The genomes were in the same order as in FIG. 8, except HL056PA1 (as a negative control).

FIG. 12 reflects that quantitative PCR (qPCR) confirmed that the copy number of plasmid in each genome is 1-3 as predicted from sequence coverage comparison. Pak and RecA are housekeeping genes located on the chromosome and TadA is a conserved gene in the Tad locus located on the plasmid. The copy number ratio between TadA and Pak ranges from 1 to 3 in genomes, while the ratio between RecA and Pak is 1 in all the genomes. The TadA gene in HL078PA1 and HL045PA1 had amplification in late cycles in qPCR. Conventional PCR confirmed the amplification of TadA in these two strains, while other strains without the plasmid showed no amplification (data not shown).
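The source does not state how the copy number ratios were derived from the qPCR data. One common approach, shown here purely as an assumption, is the delta-Ct calculation, in which the ratio of a target gene to a reference gene is the amplification efficiency raised to the difference in threshold cycles. The function name, the default efficiency of 2.0 (perfect doubling per cycle), and the Ct values below are all hypothetical.

```python
def copy_number_ratio(ct_target, ct_reference, efficiency=2.0):
    """Estimate target/reference copy ratio from threshold cycles (delta-Ct).

    Assumes both assays amplify with the same per-cycle efficiency; a lower
    Ct for the target means more starting copies than the reference.
    """
    return efficiency ** (ct_reference - ct_target)


# Hypothetical Ct values, for illustration only.
tada_vs_pak = copy_number_ratio(ct_target=19.0, ct_reference=20.0)  # plasmid gene vs. Pak
reca_vs_pak = copy_number_ratio(ct_target=20.0, ct_reference=20.0)  # chromosomal gene vs. Pak
print(tada_vs_pak, reca_vs_pak)
```

Under these made-up inputs, a TadA signal one cycle earlier than Pak corresponds to roughly two plasmid copies per chromosome, while equal Ct values for RecA and Pak correspond to a ratio of 1.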

FIG. 13 shows a power law regression for new genes (n) discovered with the addition of new genome sequences (N). Circles are the medians of n for 200 simulations. Error bars indicate the standard deviations for the 200 simulations.

FIG. 14 shows a power law regression for total genes (n) accumulated with the addition of new genome sequences (N). Circles are the medians of n for 200 simulations. Error bars indicate the standard deviations for the 200 simulations.
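For readers unfamiliar with the power law regressions referred to in FIGS. 13 and 14, the following Python sketch fits a curve of the form n = k * N**g by ordinary least squares on log-transformed values. The fitting procedure actually used for the figures is not described in this section, so the log-log method, the symbol names k and g, and the synthetic data are assumptions for illustration.

```python
import math


def fit_power_law(N_values, n_values):
    """Fit n = k * N**g by linear least squares on (log N, log n).

    Returns (k, g). All inputs must be positive.
    """
    xs = [math.log(N) for N in N_values]
    ys = [math.log(n) for n in n_values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    g = sxy / sxx                       # slope in log-log space is the exponent
    k = math.exp(mean_y - g * mean_x)   # intercept in log-log space is log(k)
    return k, g


# Synthetic medians generated from n = 3 * N**0.5 (not data from the figures).
genomes = [1, 2, 4, 8, 16, 32]
medians = [3.0 * N ** 0.5 for N in genomes]
k, g = fit_power_law(genomes, medians)
print(round(k, 6), round(g, 6))
```

In a pan-genome setting, the sign and magnitude of the fitted exponent indicate how quickly new or total gene counts change as genomes are added.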

FIG. 15 shows the proportion of the 123,223 SNPs in the core regions specific to recA types I, II and III.

FIG. 16 shows the phylogenetic tree of 82 P. acnes strains constructed based on the 123,223 SNPs in the core regions (2.20 Mb). The distances between strains were calculated as nucleotide substitution rates at all SNP sites, colored according to the scale bar. The strains from the same individuals (SSIs) belonging to the same lineages were marked with “+”.

FIG. 17 shows the pan-genomes of types IA (A), IB (B) and II (C) strains. Circles are the medians of n for 200 simulations. Error bars are standard deviations for the 200 simulations.

FIG. 18 shows the SNP distribution in core regions. (A) shows SNP frequencies (percentage of polymorphic sites) of the genes in the core regions. (B) provides K-S statistics for genes that had higher SNP frequencies with more than two standard deviations (SD). (C) reflects non-synonymous mutation frequencies of the genes in the core regions. (D) provides K-S statistics for genes that had higher non-synonymous mutation frequencies with more than 2 standard deviations.

FIG. 19 provides the distances between P. acnes strains in the same lineage (A) and in different lineages (B).

FIG. 20 reflects that P. acnes strains within each lineage share unique non-core genomic regions. Rows represent 82 P. acnes genomes and columns represent 314 non-core regions that are longer than 500 bp. The genomes and the non-core regions were clustered based on similarity, respectively. The width of each block plotted is not proportional to the genomic length of each non-core region. The presence of a non-core region is colored in yellow, and the absence is colored in blue. The color schemes used for RT and clades are the same as in FIG. 16.

FIG. 21 provides CRISPR spacer sequences in RT2 and RT6 strains. A total of 48 CRISPR spacer sequences were found in 11 P. acnes genomes, 29 of which were unique. Some CRISPR spacers were found in multiple strains. For example, spacer 2 (S2) was shared by HL060PA1 and HL082PA2. Spacer 17 (S17) was shared by J139, ATCC11828, HL110PA3, HL110PA4, HL042PA3 and HL202PA1. Spacer 18 (S18) was shared by J139, ATCC11828, HL110PA3, HL110PA4, and HL202PA1. The tree was from FIG. 16 constructed based on the 123,223 SNPs in the core regions.

FIG. 22 reflects genes with putative lipase activity in the P. acnes genomes. (A) gives a summary of 13 genes with putative lipase activity based on the annotations of KPA171202 and SK137 genomes. (B) reflects insertions/deletions and frameshift observed in ORF HMPREF0675-4856.

FIG. 23 reflects fast detection of acne-associated P. acnes strains using multiplex PCR targeting loci 1, 2, and 3.

FIG. 24 shows the relative abundances of Locus 1 and Locus 2 as compared to the housekeeping gene Pak.

FIG. 25 reflects qPCR triplex amplification plots for clinical samples #1 (A) and #2 (B) showing amplification of P. acnes Locus 1, Locus 3, and Pak.

FIG. 26 shows the evolutionary relationships/phylogenetic tree of 32 phages.

FIG. 27 shows a diagram of the methods of the invention for the diagnosis and personalization of therapy for acne.

FIG. 28 shows a flow chart of the methods of the invention for the diagnosis and personalization of therapy for acne.

FIG. 29 provides P. acnes phage genomes and annotations. Genome organizations of all 15 phages are shown. Hatched arrows in previously published genomes represent newly annotated ORFs proposed. Italicized legend entries refer to newly-annotated or revised ORFs.

FIG. 30 provides a phylogenetic tree of 29 sequenced phage genomes constructed based upon the 6,148 SNPs in the core regions. Branches with bootstrap values less than 80 (based on 200 resamplings) were collapsed.

FIG. 31 provides phylogenetic trees based on the genome sequences. (A) provides a phylogenetic tree constructed based on the entire genome sequences of all 16 phages. With the exception of PHL112N00, the phylogenetic relationships among the phages remain the same as using the core regions only, shown in FIG. 30. (B) shows the phylogenetic tree that was constructed using only the left-arm of the genomes, which are highly conserved among the phages. (C) shows the phylogenetic tree that was constructed using only the right-arm coding regions. Groups I and II from FIG. 30 are also indicated in the trees. Branches with bootstrap values less than 80 (based on 5,000 resamplings) were collapsed.

FIG. 32 shows the phylogenetic trees constructed based upon the nucleotide sequences of amidase (A) and head protein (B) from all phages, including the sequences from Lood et al. The phylogenetic relationships among the phages from the previous study remain the same in these trees. Groups I and II remain the same as in the genome shown in FIG. 30.

FIG. 33 reflects multiple alignments generated for genomes from Groups I and II of closely-related phages. Sites of nucleotide variations are mapped to a member from each group. The density of variable sites in each 50-nt window of the genome is indicated in red, with 100% density indicating that all 50 sites in the window vary between the group members. (A) provides variations among Group I phages (PHL010M04, PHL066M04, PHL073M02) mapped to the PHL010M04 genome. (B) provides variation among Group II phages (PHL115M02, PHL085M01, PHL085N00, PHL037M02) mapped to the PHL115M02 genome. Gray arrows represent ORFs in each genome.
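The windowed density described for FIG. 33 can be illustrated with a small Python sketch. It assumes a gap-free multiple alignment in which all sequences have equal length, and it uses consecutive, non-overlapping windows; whether the figure used sliding or non-overlapping windows is not stated, so that choice, the function name, and the toy sequences are assumptions.

```python
def variable_site_density(aligned, window=50):
    """Percent of variable sites per consecutive window of an alignment.

    `aligned` is a list of equal-length sequences; a site is variable when
    the sequences do not all share the same character at that position.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    variable = [len(set(column)) > 1 for column in zip(*aligned)]
    densities = []
    for start in range(0, length, window):
        chunk = variable[start:start + window]  # last window may be shorter
        densities.append(100.0 * sum(chunk) / len(chunk))
    return densities


# Toy 8-nt alignment with a 4-nt window, for illustration only.
toy_alignment = ["AAAAAAAA", "AAAATAAA", "AAAAAAAA"]
densities = variable_site_density(toy_alignment, window=4)
print(densities)
```

With the default `window=50`, a value of 100.0 would mean that all 50 sites in that window vary between the group members, matching the description above.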

FIG. 34 shows host range and specificity of P. acnes phages. The susceptibility/resistance of 66 P. acnes strains, three P. humerusii strains, and one P. granulosum strain against 15 newly sequenced phages is shown. Dendrograms on the top and to the left represent the respective phylogenetic trees of the phages and P. acnes strains (only topology is shown). “S” indicates that the tested Propionibacterium strain was susceptible to the tested phage. Numbers in red represent the fold increase in resistance of the Propionibacteria strains against phages relative to P. acnes strain ATCC6919.

FIG. 35 provides a correlation between P. acnes resistance to phages and the presence of matched CRISPR spacers. The colored pixels in each cell represent the CRISPR spacers encoded in each P. acnes strain (shown in rows). Each red pixel means that this spacer has an exact protospacer match in the corresponding phage (shown in columns). Each orange pixel means that this spacer has a partially matched protospacer (one to two mismatches) in the corresponding phage. Gray pixels mean no matched protospacers. Pink cells indicate bacterial resistance to the phages.

FIG. 36 reflects that each of the 15 sequenced phages was aligned to all 8 CRISPR spacer arrays identified in the P. acnes strains to identify protospacer sequences in each phage genome that have an exact match (red) or up to two mismatches (orange). Plus- and minus-strand protospacers are shown above and below the genomes, respectively.

FIG. 37 reflects sequence conservation in protospacers and PAMs. The protospacers that match exactly to the CRISPR spacers encoded in strain HL042PA3 and their associated PAM sequences are shown. Sequence conservation among the protospacer motifs is shown in (A) for the phages to which HL042PA3 is resistant and in (B) for the phages to which HL042PA3 is susceptible.

DETAILED DESCRIPTION

In one embodiment, the invention provides a method for determining whether an individual possesses acne comprising: obtaining a skin sample from an individual; isolating bacterial DNA from said sample; amplifying 16S ribosomal DNA in said sample; sequencing said amplified DNA products; and typing the individual's DNA based on one or more of the ten major ribotypes (RTs) of P. acnes strains, RT1-RT10 (SEQ ID NOs 1-10), wherein said typing occurs by determining whether said individual possesses one or more of RT1-RT10 and wherein said individual is diagnosed as having acne if said individual possesses RT4, RT5, RT7, RT8, RT9, or RT10. For example, said individual may be diagnosed as having acne if said individual possesses RT4 (SEQ ID NO:4), RT5 (SEQ ID NO:5), or RT8 (SEQ ID NO:8).
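By way of illustration only, the final typing step of the foregoing embodiment may be sketched in Python as follows. The sketch assumes that sequencing and ribotype assignment have already produced a set of ribotype labels for the sample; the names ACNE_ASSOCIATED_RTS, MAJOR_RTS, and classify_by_ribotype are hypothetical and are not part of the described method.

```python
# Ribotypes named in the text as diagnostic of acne. The names and data
# layout below are illustrative assumptions, not part of the disclosure.
ACNE_ASSOCIATED_RTS = frozenset({"RT4", "RT5", "RT7", "RT8", "RT9", "RT10"})
MAJOR_RTS = frozenset(f"RT{i}" for i in range(1, 11))


def classify_by_ribotype(detected):
    """Return (diagnosed, matching_ribotypes) for an iterable of RT labels.

    `detected` holds the ribotype labels (e.g. "RT1", "RT4") already
    assigned to a sample by 16S rDNA sequencing.
    """
    detected = set(detected)
    unknown = detected - MAJOR_RTS
    if unknown:
        raise ValueError(f"not one of RT1-RT10: {sorted(unknown)}")
    # Sort numerically so that RT10 follows RT9 rather than RT1.
    hits = sorted(detected & ACNE_ASSOCIATED_RTS, key=lambda rt: int(rt[2:]))
    return bool(hits), hits
```

For instance, a sample typed as RT1 and RT4 would be flagged on the basis of RT4, whereas a sample typed only as RT1, RT2, and RT6 would not.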

In another embodiment, the invention provides a method for diagnosing different types of acne comprising: obtaining a skin sample from a subject; isolating bacterial DNA from said sample; amplifying 16S ribosomal DNA in said sample; sequencing said amplified DNA products; and typing the subject's DNA based on one or more of the five major microbiome types of P. acnes strains, wherein said subject is diagnosed as having acne if said subject is typed to microbiome IV or V.

In yet another embodiment, the invention provides a method for rapidly diagnosing acne comprising: obtaining a skin sample from a subject; isolating bacterial DNA from said sample; using one or more primer sets to amplify said DNA; and analyzing said amplified DNA for the presence of a sequence having at least 95% homology with at least one of SEQ ID NOs 29-32 and 82-434, wherein said subject is diagnosed as having acne if the presence of a sequence having at least 95% homology with at least one of SEQ ID NOs 29-32 and 82-434 exists. For example, said amplified DNA may be analyzed for the presence of a sequence having at least 99% homology with at least one of SEQ ID NOs 29-32 and 82-434, wherein said subject is diagnosed as having acne if the presence of a sequence having at least 99% homology with at least one of SEQ ID NOs 29-32 and 82-434 exists. As another example, said amplified DNA may be analyzed for the presence of at least one of SEQ ID NOs 29-32 and 82-434, wherein said subject is diagnosed as having acne if the presence of at least one of SEQ ID NOs 29-32 and 82-434 exists.

In another embodiment, the invention provides a method for rapidly diagnosing acne comprising: obtaining a skin sample from a subject; isolating bacterial DNA from said sample; using one or more primer sets to amplify said DNA; using one or more probes to detect said amplified DNA; and analyzing said probe signals for the presence of Locus 1 (at least one sequence having at least 95% homology to at least one of SEQ ID NOs 29 and 82-97), Locus 2 (at least one sequence having at least 95% homology to at least one of SEQ ID NOs 30 and 98-186), Locus 3 (at least one sequence having at least 95% homology to at least one of SEQ ID NOs 31 and 187-423), and/or Locus 4 (at least one sequence having at least 95% homology to at least one of SEQ ID NOs 32 and 424-434), wherein said subject is diagnosed as having acne if one or more of Loci 1-4 are present. For example, the signals may be analyzed for the presence of Locus 1, Locus 2, Locus 3, and/or Locus 4 based upon at least 99% homology or 100% homology.

In the foregoing methods, a primer of said primer sets may be selected from the group consisting of SEQ ID NOs 11, 12, 17, and 18 (for Locus 1), SEQ ID NOs 13, 14, 20, and 21 (for Locus 2), SEQ ID NOs 15, 16, 23, and 24 (for Locus 3), and SEQ ID NOs 26 and 27 (for Locus 4). In the foregoing methods, said probes may be SEQ ID NO:19 (for Locus 1), SEQ ID NO:22 (for Locus 2), SEQ ID NO:25 (for Locus 3), and SEQ ID NO:28 (for Locus 4).
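By way of illustration only, the locus-based decision rule and the primer and probe assignments recited above may be organized as in the following Python sketch. The SEQ ID numbers are taken from the text; the dictionary layout, the function names, and the assumption that a best percent homology per locus has already been computed upstream are hypothetical.

```python
# Primer and probe SEQ ID NOs per locus, as recited in the text.
# The data structure itself is an illustrative assumption.
LOCI = {
    "Locus 1": {"primers": (11, 12, 17, 18), "probe": 19},
    "Locus 2": {"primers": (13, 14, 20, 21), "probe": 22},
    "Locus 3": {"primers": (15, 16, 23, 24), "probe": 25},
    "Locus 4": {"primers": (26, 27), "probe": 28},
}


def loci_present(best_homology, threshold=95.0):
    """List the loci whose best observed percent homology meets the threshold.

    `best_homology` maps a locus name to the highest percent homology found
    between the amplified DNA and any reference sequence of that locus.
    Loci missing from the mapping are treated as not detected.
    """
    return [name for name in LOCI if best_homology.get(name, 0.0) >= threshold]


def diagnose(best_homology, threshold=95.0):
    """Return True if one or more of Loci 1-4 are present."""
    return len(loci_present(best_homology, threshold)) > 0
```

The threshold argument may be raised to 99.0 or 100.0 to reflect the stricter homology levels mentioned above.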

In yet another embodiment, the invention provides a vaccine for the prevention and/or treatment of acne caused by P. acnes comprising a heat inactivated P. acnes strain, an attenuated protein of said strain, or combination thereof, wherein said strain is an RT4 strain, an RT5 strain, an RT7 strain, an RT8 strain, an RT9 strain, or an RT10 strain.

In yet another embodiment, the invention provides a vaccine for the prevention and/or treatment of acne caused by P. acnes comprising a heat inactivated P. acnes strain, an attenuated protein of said strain, or combination thereof identified to be specific to a subject based on 16S rDNA sequence analysis of the strains of P. acnes affecting said subject.

With regard to the vaccines, said heat inactivated P. acnes strain, attenuated protein, or combination thereof may be specific for at least one of unique genomic loci, regions, or sequences identified for the strains of P. acnes. Said heat inactivated P. acnes strain, attenuated protein, or combination thereof may be specific for at least one of Locus 1 (SEQ ID NOs 29 and 82-97), Locus 2 (SEQ ID NOs 30 and 98-186), Locus 3 (SEQ ID NOs 31 and 187-423), and Locus 4 (SEQ ID NOs 32 and 424-434).

In yet another embodiment, the invention provides a method for the personalized treatment of acne comprising determining the strains of P. acnes affecting a subject and treating said subject with an active ingredient directed to at least one detected strain of P. acnes, wherein the active ingredient comprises a drug targeting specific strains of P. acnes, wherein the targeting drug comprises small molecules, antisense molecules, siRNA, biologics, antibodies, and combinations thereof targeting genomic elements specific for strains of P. acnes associated with acne.

In yet another embodiment, the invention provides a method for treating acne comprising: administering an effective amount of a probiotic that comprises at least one strain of P. acnes that is associated with healthy or normal skin based on its 16S rDNA. Said strain may be an RT6 strain. Said strain may have at least 95% homology to SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, or SEQ ID NO:54, such as at least 99% homology or 100% homology.

In yet another embodiment, the invention provides a method for treating acne comprising: administering an effective amount of a metabolite produced by a strain of P. acnes that is associated with healthy or normal skin, wherein said metabolite is selected from the group comprising bacterial culture supernatant, cell lysate, proteins, nucleic acids, lipids, and other bacterial molecules. Said strain may be an RT6 strain. Said strain may have at least 95% homology to SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, or SEQ ID NO:54, such as at least 99% homology or 100% homology.

In yet another embodiment, the invention provides a method for treating acne in a subject comprising: administering an effective amount of a drug specifically targeting RT4, RT5, RT7, RT8, RT9, or RT10, when said subject is determined to possess RT4, RT5, RT7, RT8, RT9, or RT10, respectively. The earlier-described methods may be performed prior to administration of said drug. Said drug may be a small molecule, antisense molecule, siRNA, biologic, antibody, or combination thereof.

In yet another embodiment, the invention provides a composition comprising at least one strain of P. acnes that is associated with healthy or normal skin. Said strain may be an RT6 strain. Said strain may have at least 95% homology to SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, or SEQ ID NO:54, such as at least 99% homology or 100% homology.

In yet another embodiment, the invention provides a method for diagnosing IB-3-based acne comprising: obtaining a skin sample from a subject; isolating bacterial DNA from said sample; using one or more primer sets to amplify said DNA; and analyzing said amplified DNA for the presence of a sequence having at least 95% homology with at least one of SEQ ID NOs 55-81, wherein said subject is diagnosed as having IB-3-based acne if the presence of a sequence having at least 95% homology with at least one of SEQ ID NOs 55-81 exists.

In yet another embodiment, the invention provides a method for the personalized treatment of acne comprising determining the strain(s) of acne affecting a subject and administering to said subject an effective amount of at least one phage specifically directed to said strain(s). For example, the subject may be treated with phage directed against an RT4 strain, an RT5 strain, an RT7 strain, an RT8 strain, an RT9 strain, and/or an RT10 strain.

In yet another embodiment, the invention provides a method for treating an individual suffering from acne of microbiome type I comprising administering to said individual an effective amount of a phage, wherein said phage is selected from the group consisting of: PHL113M01 (SEQ ID NO:36), PHL111M01 (SEQ ID NO:33), PHL082M00 (SEQ ID NO:47), PHL060L00 (SEQ ID NO:34), PHL067M10 (SEQ ID NO:42), PHL071N05 (SEQ ID NO:41), PHL112N00 (SEQ ID NO:35), PHL037M02 (SEQ ID NO:45), PHL085N00 (SEQ ID NO:46), PHL115M02 (SEQ ID NO:43), PHL085M01 (SEQ ID NO:44), PHL114L00 (SEQ ID NO:37), PHL010M04 (SEQ ID NO:38), and PHL066M04 (SEQ ID NO:39).

In yet another embodiment, the invention provides a method for treating an individual suffering from acne of microbiome type I with IB-3 strain comprising administering to said individual an effective amount of a phage, wherein said phage is selected from the group consisting of: PHL082M00 (SEQ ID NO:47) and PHL071N05 (SEQ ID NO:41).

In yet another embodiment, the invention provides a method for treating an individual suffering from acne of microbiome type II comprising administering to said individual an effective amount of a phage, wherein said phage is selected from the group consisting of: PHL113M01 (SEQ ID NO:36), PHL060L00 (SEQ ID NO:34), PHL112N00 (SEQ ID NO:35), and PHL085M01 (SEQ ID NO:44).

In yet another embodiment, the invention provides a method for treating an individual suffering from acne of microbiome type III or dominant RT8 comprising administering to said individual an effective amount of a phage, wherein said phage is selected from the group consisting of: PHL113M01 (SEQ ID NO:36), PHL111M01 (SEQ ID NO:33), PHL082M00 (SEQ ID NO:47), PHL060L00 (SEQ ID NO:34), PHL067M10 (SEQ ID NO:42), PHL071N05 (SEQ ID NO:41), PHL112N00 (SEQ ID NO:35), PHL037M02 (SEQ ID NO:45), PHL085N00 (SEQ ID NO:46), PHL115M02 (SEQ ID NO:43), PHL085M01 (SEQ ID NO:44), PHL114L00 (SEQ ID NO:37), PHL073M02 (SEQ ID NO:40), PHL010M04 (SEQ ID NO:38), and PHL066M04 (SEQ ID NO:39).

In yet another embodiment, the invention provides a method for treating an individual suffering from acne of microbiome type IV comprising administering to said individual an effective amount of a phage, wherein said phage is selected from the group consisting of: PHL113M01 (SEQ ID NO:36), PHL111M01 (SEQ ID NO:33), PHL082M00 (SEQ ID NO:47), PHL060L00 (SEQ ID NO:34), PHL067M10 (SEQ ID NO:42), PHL071N05 (SEQ ID NO:41), PHL112N00 (SEQ ID NO:35), PHL037M02 (SEQ ID NO:45), PHL085N00 (SEQ ID NO:46), PHL115M02 (SEQ ID NO:43), PHL085M01 (SEQ ID NO:44), PHL114L00 (SEQ ID NO:37), PHL073M02 (SEQ ID NO:40), PHL010M04 (SEQ ID NO:38), and PHL066M04 (SEQ ID NO:39).

In yet another embodiment, the invention provides a method for treating an individual suffering from acne of microbiome type V comprising administering to said individual an effective amount of a phage, wherein said phage is selected from the group consisting of: PHL113M01 (SEQ ID NO:36), PHL111M01 (SEQ ID NO:33), PHL082M00 (SEQ ID NO:47), PHL060L00 (SEQ ID NO:34), PHL067M10 (SEQ ID NO:42), PHL071N05 (SEQ ID NO:41), PHL112N00 (SEQ ID NO:35), PHL037M02 (SEQ ID NO:45), PHL085N00 (SEQ ID NO:46), PHL115M02 (SEQ ID NO:43), PHL085M01 (SEQ ID NO:44), PHL114L00 (SEQ ID NO:37), PHL073M02 (SEQ ID NO:40), PHL010M04 (SEQ ID NO:38), and PHL066M04 (SEQ ID NO:39).

In yet another embodiment, the invention provides a method for treating a Propionibacterium humerusii-associated malady comprising administering to an individual an effective amount of a phage, wherein said phage is selected from the group consisting of: PHL113M01 (SEQ ID NO:36), PHL111M01 (SEQ ID NO:33), PHL082M00 (SEQ ID NO:47), PHL067M10 (SEQ ID NO:42), PHL071N05 (SEQ ID NO:41), PHL085N00 (SEQ ID NO:46), PHL085M01 (SEQ ID NO:44), PHL114L00 (SEQ ID NO:37), PHL073M02 (SEQ ID NO:40), and PHL010M04 (SEQ ID NO:38).

In yet another embodiment, the invention provides a kit for diagnosing acne in a subject, wherein said kit comprises: at least one primer selected from the group comprising SEQ ID NOs 11-18, 20, 21, 23, 24, 26, and 27; and instructions for use.

In yet another embodiment, the invention provides a kit for diagnosing acne in a subject, wherein said kit comprises: at least one primer selected from the group comprising SEQ ID NOs 11-18, 20, 21, 23, 24, 26, and 27; at least one probe selected from the group comprising SEQ ID NOs 19, 22, 25, and 28; and instructions for use.

Nucleotide, polynucleotide, or nucleic acid sequence will be understood to mean both a double-stranded or single-stranded DNA in the monomeric and dimeric forms and the transcription products of said DNAs.

Homologous nucleotide sequence means a nucleotide sequence having a percentage identity with the bases of a nucleotide sequence according to the invention of at least 80%, preferably 90%, 95%, 96%, 97%, 98%, 99% or 100%. This percentage is statistical and the differences between two nucleotide sequences may be determined at random or over the whole of their length.
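By way of illustration only, a percentage identity of the kind referred to above may be computed over two already-aligned sequences as in the following Python sketch. The text does not specify an alignment method or how gaps are scored; the sketch assumes equal-length aligned input and, as one possible convention, counts a gap opposite a base as a mismatch. The function name percent_identity is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Columns where both sequences carry a gap are skipped; a gap opposite
    a base counts as a mismatch (an assumed convention, not one stated
    in the text).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # uninformative column
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def is_homologous(aligned_a, aligned_b, threshold=80.0):
    """True if the percent identity meets the chosen threshold (80% minimum in the text)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under this convention, two 20-base sequences differing at a single position have 95% identity and would satisfy the 95% threshold but not the 99% threshold.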

The invention comprises the polypeptides encoded by a nucleotide sequence according to the invention, including a polypeptide whose sequence is represented by a fragment. Herein, the terms polypeptide, peptide, and protein are interchangeable.

Polypeptides allow monoclonal or polyclonal antibodies to be prepared which are characterized in that they specifically recognize the polypeptides. The invention relates to mono- or polyclonal antibodies or their fragments, or chimeric antibodies, characterized in that they are capable of specifically recognizing a polypeptide.

Polypeptides used in vaccine compositions according to the invention may be selected by techniques known to the person skilled in the art such as, for example, depending on the capacity of said polypeptides to stimulate the T cells, which is translated, for example, by their proliferation or the secretion of interleukins, and which leads to the production of antibodies directed against said polypeptides. Vaccine combinations will preferably be combined with a pharmaceutically acceptable vehicle and, if need be, with one or more adjuvants of the appropriate immunity. Pharmaceutically acceptable vehicle means a compound or a combination of compounds that does not provoke secondary reactions and which allows, for example, the facilitation of the administration of the active compound, an increase in its duration of life and/or its efficacy in the body, an increase in its solubility in solution, or an improvement in its conservation.

Applicants identified ten major lineages of Propionibacterium acnes and five major microbiome types in the human pilosebaceous unit (“pore”), where acne arises. Some of the P. acnes lineages and microbiome types are highly enriched in acne patients and some are associated with healthy skin. The unique genomic components of each major lineage, including a linear plasmid that is unique to acne-associated lineages, have been identified. This information is used, for example, to: (1) provide a method/kit to isolate bacterial DNA/RNA from pilosebaceous units for downstream analysis; (2) rapidly and accurately detect/diagnose/identify the microbiome type of the affected subject and the major strains of P. acnes present in the pores of the affected subject; (3) develop vaccines against acne-associated P. acnes strains; (4) develop probiotics using the strains associated with healthy skin in topical creams, solutions, and the like; (5) develop drugs, including small molecules, biologics, and antibodies targeting the genetic elements and biological pathways unique to the P. acnes strains associated with acne; and (6) develop bacteriophage-based strain-specific therapy to treat acne.

Once the microbiome type of a subject affected with acne is diagnosed, several approaches described below may be used to formulate an effective treatment plan. For example, if the subjects have microbiome types IV or V, or are dominated by P. acnes RT10 strains, it is less likely that antibiotic treatment will succeed because these strains are antibiotic resistant. However, other treatment methods remain available, such as retinoids.

According to one embodiment of the invention, in a case where the subject has the virulent ribotypes, including RT4, RT5, and RT8, target specific drugs including small molecules, biologics, and antibodies may be more effective treatments. In a preferred embodiment of the invention, such a patient may be treated with antibodies targeting the genetic elements and biological pathways that are unique to P. acnes strains associated with acne.

According to another embodiment of the invention, in a case where the dominant P. acnes strains affecting the subject do not harbor a set of CRISPR/Cas genes, the additional treatment of phage therapy may be more effective.

The present invention also pertains to alternative treatment strategies for acne treatment to balance the relative abundance of P. acnes strains by promoting the growth of health-associated strains.

The present invention pertains to methods and kits to isolate bacterial DNA/RNA from pores of affected subjects for downstream genetic analysis. More specifically, the present invention pertains to protocols for the extraction of bacterial genomic DNA and RNA from microcomedone samples. In one particular embodiment of the invention, Biore® Deep Cleansing Pore Strips may be used to sample the bacteria from a subject. Genomic DNA may be extracted according to methods known in the art. For example, the QIAamp DNA Micro Kit (Qiagen) is a commercially available kit that may be used to extract genomic DNA from the supernatant obtained by lysing cells/microcomedones using a beadbeater.

The present invention also pertains to fast and accurate methods and kits for the detection and/or diagnosis of microbiome types in affected subjects. The microbiome typing/microbiome-specific treatment is based on ten major lineages of P. acnes strains and five major microbiome types in the human pilosebaceous unit found through a comprehensive metagenomic analysis using full length 16S rDNA sequencing.

Indeed, samples were PCR-amplified using 16S rDNA specific primers with the following sequences: 27f-MP 5′-AGRGTTTGATCMTGGCTCAG-3′ and 1492r-MP 5′-TACGGYTACCTTGTTAYGACTT-3′. Optionally, following gel purification, the 1.4 Kb product is excised and further purified using, for example, a Qiagen QIAquick Gel Extraction Kit. The purified product is cloned into One Shot E. coli cells using, for example, a TOPO TA cloning kit from Invitrogen. Sequencing is done with a universal forward primer, a universal reverse primer, and, for a subset, the internal 16S rDNA primer 907R, with sequences of TGTAAAACGACGGCCAGT (forward), CAGGAAACAGCTATGACC (reverse), and CCGTCAATTCCTTTRAGTTT (907R). Sequence reactions were loaded on ABI 3730 machines from ABI on 50 cm arrays with a long read run module.
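The primers above contain degenerate positions written with standard IUPAC nucleotide codes (R = A or G; M = A or C; Y = C or T). By way of illustration only, the following Python sketch enumerates the concrete oligonucleotide sequences that each degenerate primer represents. The primer sequences are taken from the text; the function and constant names are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as given in the text (5' to 3').
PRIMER_27F_MP = "AGRGTTTGATCMTGGCTCAG"
PRIMER_1492R_MP = "TACGGYTACCTTGTTAYGACTT"
PRIMER_907R = "CCGTCAATTCCTTTRAGTTT"


def expand_primer(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for name, seq in [("27f-MP", PRIMER_27F_MP),
                      ("1492r-MP", PRIMER_1492R_MP),
                      ("907R", PRIMER_907R)]:
        variants = expand_primer(seq)
        print(name, len(variants), variants)
```

Each of 27f-MP and 1492r-MP contains two two-fold degenerate positions and therefore represents four sequences, while 907R contains one such position and represents two.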

Each lineage of P. acnes has unique genomic loci, regions, and sequences. Accordingly, specific primers may be generated to target the lineage-specific genomic regions to detect the presence or absence of each lineage, as well as the relative amount of each lineage, using methods known in the art, such as PCR/qPCR. This detection occurs within several hours of obtaining the samples. Prior to Applicants' invention, this required much more time, often weeks, using culture-based methods. According to one embodiment of the invention, affected subjects are grouped for microbiome specific treatments based on these diagnoses.

According to the methods of the present invention, unique genomic loci 1, 2, and 3 for strains of ribotypes 4 and 5 have been shown to be associated with acne. Using specific primers targeting loci 1, 2, and 3, lineages that contain these loci can be distinguished from lineages that lack these loci. In addition, using PCR/qPCR techniques, the relative abundance of each strain may also be detected. Analysis of a mock community has shown that isolates with loci 1, 2, and 3 in an abundance of 7.5% or higher in the microbiome may be detected using these techniques. Given the sensitivity of qPCR, lower abundance levels, down to a few DNA copies, may also be detectable.

It has previously been reported that heat inactivation of P. acnes may be an effective means of developing P. acnes-based vaccines. See T. Nakatsuji et al., 128(10) J. Invest. Dermatol. 2451-2457 (October 2008). In one aspect of the present invention, vaccines are developed against acne-associated P. acnes strains. In another aspect of the present invention, personalized vaccines are developed against acne-associated P. acnes strains. In yet another aspect of the present invention, vaccines are developed against acne-associated P. acnes strains using inactive P. acnes strains or heat attenuated proteins. Strains suitable for use as vaccines may be identified based on 16S rDNA sequencing, identifying lineages of P. acnes strains associated with acne, and the unique genomic loci, regions, and sequences for each lineage to specifically target strains of P. acnes associated with acne and not those strains associated with healthy skin.

According to methods described above, it has been discovered that P. acnes strains with ribotypes 4, 5, 7, 8, 9, and 10 are highly associated with acne. In one embodiment of the present invention, a vaccine is raised against these individual strains separately or in combination. Similarly, the genes in loci 1, 2, and 3 may be targets for vaccination because these loci are unique to ribotypes 4 and 5, and are not found in commensal strains. Locus 4, which is unique to ribotype 8, may also serve as a potential target for vaccine therapy. The list of genes encoded in loci 1, 2, 3, and 4 is shown in Table 2.

The present invention also pertains to probiotics developed using P. acnes strains associated with healthy skin in medicines, compositions, topical creams, solutions, or other cosmetic products. Probiotics have, in the past, been used in topical creams. PROBIOTIC LAB™ announced that a mixture of 14 specific strains of bacteria was used for treatment of cystic acne (www.probiotic-lab.com/aboutusprobioticlab.html). Probiotic skin care/DERMBIOTIX has a product line, Probiotic Collagen Complex (PC3), which is claimed to have targeted anti-aging benefits to the skin. However, this is not targeted to acne treatment. Probiotic Collagen Complex (PC3) infuses the skin with the positive bacteria required to effectively combat and eradicate excess negative bacteria caused by external factors (www.dermbiotix.com). However, prior to the present invention there existed no skin probiotic product reported for acne treatment using P. acnes strains associated with healthy/normal skin. In one aspect of the present invention, skin probiotics are developed for acne treatment using P. acnes strains associated with healthy/normal skin. In another aspect of the present invention, skin probiotics are developed for acne treatment using P. acnes strains associated with healthy/normal skin based on the 16S rDNA sequencing.

In one particular embodiment of the present invention, the RT6 lineage of P. acnes, which is associated with healthy skin, is used as a topical product. In yet another embodiment of the present invention, the RT6 lineage of P. acnes is used by inoculating this isolate on the human skin in order to outcompete the acne associated strains. In another embodiment, molecules, including proteins, nucleic acids, lipids, and other metabolites, supernatant of cultures, and/or cell lysate of these strains may be used as probiotics.

The present invention also pertains to drugs targeting acne associated P. acnes strains. This is based upon multiple genome comparison of P. acnes in combination with 16S rDNA metagenomic analysis, thereby identifying certain strains and genomic variations associated with acne. Drugs intended to target acne associated P. acnes include custom designed small molecules, antisense molecules, siRNA molecules, biologics, and antibodies targeting genomic elements specific for strains which are associated with acne. Antisense RNA, antibodies, or small molecules can be designed targeting loci 1, 2, 3, and 4. Strains with ribotypes 4, 5, and 10 are antibiotic resistant. Thus, there is a need in the art for new antibiotics targeting ribotypes 4, 5, and 10.

The present invention also pertains to personalized phage therapy for subjects affected with acne comprising phages specific to certain strains of P. acnes. Certain companies provide phage therapy for acne patients, such as the Phage Therapy Center™ (www.phagetherapycenter.com/pii/PatientServlet?command=static_home). However, such companies provide no information on the bacterial host specificity of the phages used for the therapy. P. acnes is commensal and some strains play a protective role for hosts. In one embodiment of the invention, personalized phage therapies include a selection of phages targeting P. acnes strains that have been shown to lack a protective role for subjects affected by acne. In yet another embodiment of the invention, personalized phage therapy may be developed according to the bacterial host specificity of the phages to target specific strains of P. acnes, leaving health associated strains intact. In addition, it is possible to identify the structure of P. acnes lineages of the affected subjects and use that structure to predict resistance to phage infection or plasmid conjugation to better target specific phage therapies. For example, P. acnes lineages RT2 and RT6 have a CRISPR/Cas structure, indicating they have resistance against certain phage infection and plasmid conjugation. Table 5 shows the sensitivity and resistance of specific P. acnes strains to specific P. acnes phages.
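By way of illustration only, the spacer-to-protospacer comparison described in connection with FIGS. 35 and 36 (an exact match, or a partial match with up to two mismatches, on either strand of the phage genome) may be sketched in Python as follows. The sketch uses a simple sliding-window mismatch count rather than the alignment procedure actually used, does not consider PAM sequences, and does not by itself establish resistance; all function names are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only are swapped)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def mismatches(a, b):
    """Number of differing positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_protospacers(spacer, phage_genome, max_mismatches=2):
    """Scan both strands of a phage genome for matches to a CRISPR spacer.

    Returns a list of (position, strand, mismatch_count) tuples, where
    position is the 0-based start on the plus strand and strand is "+"
    or "-". A mismatch_count of 0 corresponds to an exact match.
    """
    spacer = spacer.upper()
    genome = phage_genome.upper()
    n = len(spacer)
    hits = []
    for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
        for i in range(len(genome) - n + 1):
            d = mismatches(query, genome[i:i + n])
            if d <= max_mismatches:
                hits.append((i, strand, d))
    return hits
```

A strain whose spacers yield one or more hits against a given phage genome would correspond to a red (exact) or orange (one to two mismatches) pixel in a figure of the kind described for FIG. 35.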

The invention is described in more detail in the following illustrative examples. Although the examples may represent only selected embodiments of the invention, the following examples are illustrative only and in no way limiting.

EXAMPLES

Example 1: Analysis of Propionibacterium acnes Strain Populations in the Human Skin Microbiome Associated with Acne

The human skin microbiome plays important roles in skin health and disease. However, prior to Applicants' invention the bacterial population structure and diversity at the strain level was poorly understood. The inventors compared the skin microbiome at the strain level and genome level of Propionibacterium acnes, a dominant skin commensal, between 49 acne patients and 52 healthy individuals by sampling the pilosebaceous units on their noses. Metagenomic analysis demonstrated that while the relative abundances of P. acnes were similar, the strain population structures were significantly different in the two cohorts. Certain strains were highly associated with acne and other strains were enriched in healthy skin. By sequencing 66 novel P. acnes strains and comparing 71 P. acnes genomes, the inventors identified potential genetic determinants of various P. acnes strains in association with acne or health. The analysis indicates that acquired DNA sequences and bacterial immune elements may play roles in determining virulence properties of P. acnes strains and some may be targets for therapeutic interventions. This study demonstrates a previously-unreported paradigm of commensal strain populations that explains the pathogenesis of human diseases. It underscores the importance of strain level analysis of the human microbiome to define the role of commensals in health and disease.

BACKGROUND

The diversity of the human microbiota at the strain level and its association with human health and disease are largely unknown. However, many studies have shown that microbe-related human diseases are often caused by certain strains of a species, rather than the entire species being pathogenic. Examples include methicillin-resistant Staphylococcus aureus (MRSA) (Chambers and Deleo, 2009; Chen et al., 2010; Hansra and Shinkai) and Escherichia coli O157 (Chase-Topping et al., 2008; Tarr et al., 2005). Acne vulgaris (commonly called acne) is one of the most common skin diseases with a prevalence of up to 85% of teenagers and 11% of adults (White, 1998). Although the etiology and pathogenesis of acne are still unclear, microbial involvement is considered one of the main mechanisms contributing to the development of acne (Bojar and Holland, 2004; Cunliffe, 2002). In particular, Propionibacterium acnes has been hypothesized to be an important pathogenic factor (Webster, 1995). Antibiotic therapy targeting P. acnes has been a mainstay treatment for more than 30 years (Leyden, 2001). However, despite decades of study, it remained unclear as to how P. acnes contributes to acne pathogenesis while being a major commensal of the normal skin flora (Bek-Thomsen et al., 2008; Cogen et al., 2008; Costello et al., 2009; Dominguez-Bello et al., 2010; Fierer et al., 2008; Gao et al., 2007; Grice et al., 2009). Whether P. acnes protects the human skin as a commensal bacterium or functions as a pathogenic factor in acne, or both, remained to be elucidated.

Thus, Applicants compared the skin microbiome at the strain level and genome level in 49 acne patients and 52 normal individuals using a combination of metagenomics and genome sequencing. First, for each sample, 16S ribosomal DNA (rDNA) was amplified, approximately 400 clones were sequenced, and an average of 311 nearly full length 16S rDNA sequences were analyzed. The population structure of P. acnes strains was determined in each sample. Second, each P. acnes strain was assigned an “acne index” by calculating its prevalence in acne patients based on the 16S rDNA metagenomic data. The P. acnes strains associated with the acne patient group were identified, as well as the strains enriched in the individuals with normal skin. This metagenomic approach is fundamentally different from prior approaches in determining disease associations; it is more powerful and less biased than traditional methods because it bypasses the biases and selection in strain isolation and culturing. Lastly, 66 novel P. acnes strains were sequenced and 71 P. acnes genomes were compared, covering the major lineages of P. acnes found in the skin microbiota. By combining a metagenomic study of the skin microbiome and genome sequencing of this major skin commensal, Applicants' study provided insight into bacterial genetic determinants in acne pathogenesis and emphasizes the importance of strain level analysis of the human microbiome to understand the role of commensals in health and disease.

Results

P. acnes Dominates the Pilosebaceous Unit

Applicants characterized the microbiome in pilosebaceous units (“pores”) on the nose collected from 49 acne patients and 52 individuals with normal skin. Nearly full length 16S rDNA sequences were obtained using the Sanger method, which permitted analyzing P. acnes at the strain level. After quality filtering, the final dataset contained 31,461 16S rDNA sequences ranging from position 29 to position 1483. Of these, 27,358 sequences matched P. acnes with greater than 99% identity. The data demonstrated that P. acnes dominates the microbiota of pilosebaceous units, accounting for 87% of the clones (FIG. 1). Other commonly found species in pilosebaceous units included Staphylococcus epidermidis, Propionibacterium humerusii, and Propionibacterium granulosum, each representing 1%-2.3% of the total clones. A total of 536 species level operational taxonomic units (SLOTUs) belonging to 42 genera and six phyla were identified in the samples (Table S1).

TABLE S1 Six phyla and 42 genera found in pilosebaceous units.

Phylum | Genera
Actinobacteria | Actinobaculum, Corynebacterium, Gordonia, Kocuria, Microbacterium, Propionibacterium
Firmicutes | Anaerococcus, Anoxybacillus, Bacillus, Enterococcus, Erysipelothrix, Finegoldia, Gemella, Lactobacillus, Paenibacillus, Peptoniphilus, Peptostreptococcaceae, Ruminococcaceae, Staphylococcus, Streptococcus
Fusobacteria | Fusobacterium
Bacteroidetes | Chryseobacterium, Niastella, Parabacteroides, Prevotella
Proteobacteria | Caulobacteraceae, Citrobacter, Cupriavidus, Delftia, Diaphorobacter, Haemophilus, Klebsiella, Massilia, Neisseriaceae, Novosphingobium, Pelomonas, Phyllobacterium, Ralstonia, Shigella, Sphingomonas, Stenotrophomonas
Cyanobacteria | Streptophyta

To bypass the potential biases due to PCR amplification and to uneven numbers of 16S rDNA gene copies among different species, metagenomic shotgun sequencing of the total DNA pooled from the pilosebaceous unit samples of 22 additional normal individuals was performed. Microbial species were identified by mapping metagenomic sequences to reference genomes. The results confirmed that P. acnes was the most abundant species (89%) (FIG. 1). This is consistent with the results obtained from 16S rDNA sequencing (87%).

For the 16S rRNA sequence, positions 27 to 1492 were PCR amplified. However, only positions 29-1483 were analyzed. The numbering of positions is based on the E. coli system of nomenclature. Thus, the sequence between positions 29 and 1483 is important for determining the ribotype (there are many ribotypes, not just 10). For the top 10 ribotypes, the sequence between positions 529 and 1336 of the 16S rRNA is sufficient.

Different P. acnes Strain Populations in Acne

There was no statistically significant difference in the relative abundance of P. acnes when comparing acne patients and normal individuals. It was then examined whether there were differences at the strain level of P. acnes by extensively analyzing the P. acnes 16S rDNA sequences. Herein, each unique 16S rDNA sequence, i.e., each 16S rDNA allele type, is called a ribotype (RT). The most abundant P. acnes sequence was defined as ribotype 1 (RT1) (SEQ ID NO:1). All other defined ribotypes have 99% or greater sequence identity to RT1. Similar to the distributions seen at higher taxonomical levels (Bik et al.), at the strain level a few ribotypes were highly abundant in the samples, accompanied by a large number of rare ribotypes (FIG. 2). After careful examination of the sequence chromatograms and manual correction of the sequences, a total of 11,009 ribotypes were assigned to the P. acnes 16S rDNA sequences. Most of the minor ribotypes were singletons. On average, each individual harbored 3±2 P. acnes ribotypes with three or more clones. Based on the genome sequences described below, all the sequenced P. acnes strains have three identical copies of 16S rDNA genes (see note below). This allowed the P. acnes strain populations in individuals to be compared based on the 16S rDNA sequences. The top ten major ribotypes with more than 60 clones and found in multiple subjects are shown in Table 1:

TABLE 1 Top ten most abundant ribotypes found in pilosebaceous units

Ribotype | Nucleotide changes from RT1 | Number of subjects | Number of clones | Percentage of clones from acne patients^(a) | Percentage of clones from normal individuals^(b) | p-value^(c)
RT1 | none | 90 | 5536 | 48% | 52% | 0.84
RT2 | T854C | 48 | 1213 | 51% | 49% | 0.36
RT3 | T1007C | 60 | 2104 | 40% | 60% | 0.092
RT4 | G1058C, A1201C | 23 | 275 | 84% | 16% | 0.049
RT5 | G1058C | 15 | 205 | 99% | 1% | 0.00050
RT6 | T854C, C1336T | 11 | 262 | 1% | 99% | 0.025
RT7 | G529A | 10 | 188 | 99% | 1% | 0.12
RT8 | G1004A, T1007C | 5 | 239 | 100% | 0% | 0.024
RT9 | G1268A | 4 | 68 | 99% | 1% | 0.29
RT10 | T554C, G1058C | 5 | 61 | 100% | 0% | 0.024

^(a)The percentage was calculated after the number of clones of each ribotype was normalized by the total number of clones in acne patients (acne index).
^(b)The percentage was calculated after the number of clones of each ribotype was normalized by the total number of clones in normal individuals.
^(c)Mann-Whitney-Wilcoxon rank sum test.
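The normalization described in footnotes (a) and (b) of Table 1 can be illustrated with a minimal sketch. It assumes that the acne index is the ribotype's clone fraction in the acne group divided by the sum of its clone fractions in both groups; the function name and the example counts are hypothetical and are not taken from the study.

```python
def acne_index(acne_clones, normal_clones, total_acne, total_normal):
    """Share of a ribotype attributable to acne patients after each group's
    clone count is normalized by that group's total number of clones."""
    acne_fraction = acne_clones / total_acne
    normal_fraction = normal_clones / total_normal
    combined = acne_fraction + normal_fraction
    if combined == 0:
        raise ValueError("ribotype has no clones in either group")
    return acne_fraction / combined


# Hypothetical counts for illustration only.
index = acne_index(acne_clones=90, normal_clones=10,
                   total_acne=1000, total_normal=1000)
print(f"acne index: {index:.0%}")
```

Normalizing by group totals keeps a group with more sequenced clones from dominating the percentage.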

Analysis of the top ten ribotypes showed both disease-specific and health-specific associations. The three most abundant ribotypes (RT1, RT2 and RT3) were fairly evenly distributed among acne and normal individuals. However, the next seven major ribotypes were significantly skewed in their distributions (Table 1). Ribotypes 4, 5, 7, 8, 9, and 10 were found predominantly in acne patients, with four of these six statistically significantly enriched in acne (p<0.05, Wilcoxon test). Ribotypes 4, 5, and 10 contain a nucleotide substitution G1058C in the 16S rDNA sequences, which has previously been shown to confer increased resistance to tetracycline (Ross et al., 1998; Ross et al., 2001). However, only a small percentage of the subjects in our study harboring these ribotypes had been treated with antibiotics (FIG. 3); therefore, enrichment of these three ribotypes in the acne group was not correlated with antibiotic treatment. This is consistent with previous studies, which showed that previous use of antibiotics was not always associated with the presence of antibiotic resistant strains and that some patients who were not previously treated with antibiotics harbored strains already resistant to antibiotics (Coates et al., 2002; Dreno et al., 2001). One ribotype, RT6, although detected in only 11 subjects, was strongly associated with normal skin (p=0.025, Wilcoxon test) (Table 1). Its relative abundance in the normal group was similar to that found in the healthy cohort data from the Human Microbiome Project (HMP) (see FIG. 3). The percentage of positive subjects (11/52) was similar as well. Three of the 14 HMP subjects had RT6 found in the anterior nares, and one additional subject had RT6 in the left retroauricular crease.
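For readers unfamiliar with the rank sum test cited above, the following is a minimal sketch of a two-sided Mann-Whitney-Wilcoxon test on per-subject relative abundances. It uses the normal approximation without tie or continuity correction, which is an assumption; the text does not state which software or corrections were used, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(group_a, group_b):
    """Return (U statistic for group_a, two-sided p-value).

    The p-value uses the normal approximation with no tie or continuity
    correction (an assumption made for this sketch).
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both groups must be non-empty")
    ranks = average_ranks(list(group_a) + list(group_b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    return u_a, math.erfc(abs(z) / math.sqrt(2))
```

Here `group_a` and `group_b` would hold the relative abundance of one ribotype in each acne subject and each normal subject, respectively. For small samples or heavily tied data an exact or tie-corrected method would be preferable.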

Based on the distributions of the top ten ribotypes, statistical analysis using several different tests showed significant differences in P. acnes population structure between acne and normal skin (FIG. 4). This is consistent with a principal coordinate analysis, where acne samples and normal skin samples were separated mostly by principal coordinates 1 and 2 (FIG. 4), explaining 44% and 20% of the variation, respectively.

To examine whether different individuals share similar P. acnes population structures, the samples were clustered based on the relative abundances of the top ten ribotypes. Five main microbiome types were observed at the P. acnes strain level (microbiome types I to V). Types IV and V, which are dominated by P. acnes RT4 and RT5, respectively, were mainly found in acne patients (FIGS. 5 and 6).

The same five main microbiome types were observed in the HMP data and the data from Grice et al. (Grice et al., 2009) (see FIG. 7).

Genome Sequence Analysis of 71 P. acnes Strains

All of the top ten most abundant ribotypes differ from RT1 by only one or two nucleotide changes in the 16S rDNA sequence (Table 1). To determine whether such small changes in the 16S rDNA sequence reflect the lineages and evolutionary history at the genome level, 66 P. acnes isolates representing major ribotypes 1, 2, 3, 4, 5, 6, and 8 as well as two minor ribotypes, 16 and 532, were selected for genome sequencing. The genomes of these 66 isolates were fully sequenced and assembled to high quality drafts or complete genomes with 50× coverage or more. Five other P. acnes genomes, KPA171202 (Bruggemann et al., 2004), J165, J139, SK137, and SK187, were publicly available and were included in the analysis. A phylogenetic tree based on 96,887 unique single nucleotide polymorphism (SNP) positions in the core genome obtained from these 71 P. acnes genomes was constructed. Most of the genomes with the same ribotypes clustered together. The tree indicates that the 16S rDNA ribotypes do represent the relationship of the lineages to a large extent and that the 16S rDNA sequence is a useful molecular marker to distinguish major P. acnes lineages (FIGS. 8 and 9).
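As an illustration of what a SNP position in the core genome means, the sketch below lists alignment columns at which the aligned genomes disagree. It assumes the genomes have already been aligned to equal length and that columns containing a gap or ambiguous base are excluded from the core; the function name and input layout are hypothetical, and this is not the pipeline used in the study.

```python
def snp_positions(aligned_genomes):
    """Return 0-based alignment columns where unambiguous bases differ.

    aligned_genomes maps a genome name to its aligned sequence.
    """
    sequences = list(aligned_genomes.values())
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same aligned length")
    positions = []
    for column in range(length):
        bases = {seq[column].upper() for seq in sequences}
        # Keep only columns present in every genome (no gaps or ambiguity
        # codes) that carry more than one distinct base.
        if bases <= set("ACGT") and len(bases) > 1:
            positions.append(column)
    return positions
```

The resulting columns could then be concatenated per genome and passed to a tree-building tool.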

Genetic Elements Detected in P. acnes

A comparative genome analysis among all 71 genomes grouped by ribotypes was performed. The analysis revealed genetic elements by which acne-associated strains could contribute to acne pathogenesis and the elements by which health-associated strains could contribute to maintaining skin health. Specifically, the analysis identified the unique genome regions of RT4 and RT5, which had a strong association with acne, and of RT6, which was found enriched in normal skin. Three distinct regions, loci 1, 2, and 3, were found almost exclusively in strains that belong to clade IA-2 in the phylogenetic tree. Clade IA-2 consists of mainly RT4 and RT5 (FIGS. 8 and 10). Loci 1 and 2 are located on the chromosome. Locus 1 contains prophage-related genes and appears to be a genomic island. Locus 2 has plasmid integration sites and may be derived from a plasmid sequence. Locus 3 appears to be on a large mobile genetic element, likely a plasmid. The plasmid is approximately 55 Kb long and has inverted terminal repeats according to the finished genome HL096PA1. The sequence data suggest that the plasmid is linear and possibly originated from a phage (Hinnebusch and Tilly (1993)). All but one of the fifteen genomes of RT4 and RT5 have at least 60% of the genes of the plasmid represented, and all of them have regions homologous to the inverted terminal repeats in the plasmid, suggesting that they harbor the same or a similar linear plasmid (FIG. 8). The copy number of the plasmid in the genomes ranges from 1 to 3 based on genome sequencing coverage, which was confirmed by quantitative PCR (FIGS. 11 and 12).
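One common way to estimate plasmid copy number from sequencing coverage is to divide the mean read depth over the plasmid by the mean read depth over the chromosome. The sketch below assumes that approach; the text does not give the exact calculation, and the function name and depth values are hypothetical.

```python
from statistics import mean


def estimate_copy_number(plasmid_depths, chromosome_depths):
    """Ratio of mean read depth on the plasmid to mean depth on the chromosome."""
    chromosome_mean = mean(chromosome_depths)
    if chromosome_mean <= 0:
        raise ValueError("chromosome depth must be positive")
    return mean(plasmid_depths) / chromosome_mean


# Hypothetical per-base depths for illustration only.
print(estimate_copy_number([100, 120, 110], [50, 60, 55]))
```

A ratio near 1 would suggest one plasmid copy per chromosome, and a ratio near 3 would suggest three.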

The fact that acne-enriched RT4 and RT5 strains carry a linear plasmid and two unique loci of genomic islands indicates that these plasmid and chromosomal regions play a role in acne pathogenesis. In fact, the linear plasmid encodes a tight adhesion (Tad) locus, which has been suggested to play a role in virulence in other organisms (Kachlany et al., 2000; Schreiner et al., 2003). The complete Tad locus is found in all but one of the fifteen genomes of RT4 and RT5, and is only occasionally found in other ribotypes. Additionally, in locus 2, a Sag gene cluster is encoded, which has been shown to contribute to hemolytic activity in pathogens (Fuller et al., 2002; Humar et al., 2002; Nizet et al., 2000). FIG. 6 summarizes the genes that are mostly unique to RT4 and RT5, several of which play essential roles in virulence in other organisms. Some of these genes encoded in RT4 and RT5 increase virulence, promote stronger adherence to the human host, or induce a pathogenic host immune response.

In the genome comparison analysis, it was found that all the genomes of RT2 and RT6 encode Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR). Among the sequenced genomes, RT2 and RT6 are the only ribotypes encoding CRISPR. CRISPR have been shown to confer protective “immunity” against viruses, phages, and plasmids (Horvath and Barrangou, 2010; Makarova et al., 2011). The CRISPR locus encoded in P. acnes consists of a series of cas genes (cas3, cse1, cse2, cse4, cas5e, cse3, cas1, and cas2), which are homologous to the CRISPR locus reported in E. coli (Figure S10) and the CRISPR4 locus in Streptococcus thermophilus (Horvath and Barrangou, 2010).

CRISPR arrays are composed of a cluster of identical repetitive sequences separated by spacer sequences of similar length but with different nucleotide sequences. It has been found that spacer sequences are identical to, or have one or two mismatches with, phage or plasmid DNA sequences. A total of 39 spacer sequences were found in eight P. acnes strains, 25 of which were unique as shown in Table 2.

TABLE 2 CRISPR spacer sequences found in the genomes of RT2 and RT6

Each entry lists: spacer number | spacer sequence | BLAST result | match found (where applicable).

RT2, strain HL001PA1
1 | CATGGCCTGCACACCAGGCGCTTTTAGCACCT | No hits
2 | CATGGCCTGCACACCAGGCGCTTTTAGCACCT | No hits
3 | CATGGCCTGCACACCAGGCGCTTTTAGCACCT | No hits
4 | GGCGTATGACGAGTTGTGGTCGGCGTTTCCTC | P. acnes phage PA6 gp15 (minor tail protein)
5 | CGGTGTTAACGGCTTGCCTGGCTTGGATGGAG | No hits

RT2, strain HL060PA1
1 | CGCCTACCGTCAGCTGACTCACGCCTCCGCGTT | No hits
2 | TCACACCAGTCATCAGCGTCATAGTCCTCTCGG | No hits

RT2, strain HL082PA2
1 | GGCTCAGCCCTGCCCGATGCCTACGCCAAATGG | C. leptum DSM 753 CLOLEP_00129 (cell wall-associated hydrolases (invasion-associated proteins)) | Locus 3
2 | TCACACCAGTCATCAGCGTCATAGTCCTCTCGG | No hits

RT2, strain HL103PA1
1 | CACCGGGCCCATCCCGGTCGGCCTCCTGAAAGG | C. leptum DSM 753 CLOLEP_00135 | Locus 3

RT2, strain HL106PA1
1 | GATCGAGTTGGCTGAGTCGAAGGTGTTGCGGTT | P. acnes phage PA6 gp16 (conserved protein); P. acnes phage PAD20 gp16
2 | CTGCTCATCGCTCAGCTCCTGCGCCTCATCACA | No hits
3 | CTGCGCCAACAGCCGCATCTGATCCGAATACGG | P. acnes phage PA6 gp3 (phage portal protein)
4 | CGCAGCAATCTCAGAAGGCCACAACAAGTTCGT | P. acnes phage PA6 gp7 (conserved protein); P. acnes phage PAD20 gp7; P. acnes phage PAS50 gp7
5 | CAAATCACCCAAGCCCAACACGCCGCCACCACC | No hits
6 | TGTCACCGATTCAATGTATCTATGAGTGGTGTA | No hits
7 | TTGGGTGGGTGAGGTCGGGTCGTCAGTCATGAG | No hits
8 | GTCGATGTCGAGATTGGCCTGGGGGTCCATGTC | Clostridium leptum DSM 753 CLOLEP_00142 | Locus 3
9 | ACGTCGTGAACGTACCCCTTGACGGAGACGGCA | No hits

RT2, strain J139
1 | CGAGGGCTACCACGTGGTCGATTTGGACTGTCG | C. leptum DSM 753 CLOLEP_00167; P. acnes SK137 HMPREF0675_3193 (domain of unknown function) | Locus 2
2 | CAGGCGCTCCACTCCCTCGCCCTGGCCACCAAC | No hits

RT6, strains HL110PA3 and HL110PA4
1 | CTATGTGGACAGTGTTGGTTACTGTGGGGGGAA | P. acnes phage PA6 intergenic region between gp45 and gp46
2 | GCACTGCACCGATATCGTCGTGGCTGTCACTTG | No hits
3 | CCCAGACAACCTCGACAACCTGTTCAGGGGATG | P. acnes phage PAS50 gp25
4 | CATGGCTAGCCCGGATTTTTGGCTGCCTGAGCG | P. acnes phage PA6 gp34 (multidrug resistance protein-like transporters); P. acnes phage PAD20 gp34 (DNA helicase)
5 | CGGCCTGCGGCAGATTTTTGTTGCGTTGAATCC | P. acnes phage PA6 gp14 (tape measure protein); P. acnes phage PAD20 gp14 (tape measure protein); P. acnes phage PAS50 gp14 (tape measure protein)
6 | CGGGCAGAGGATGTGTTGCTCGTTCCTGGATGG | P. acnes phage PA6 gp32 (CHC2 zinc finger); P. acnes phage PAD20 gp32 (DNA primase); P. acnes phage PAS50 gp32 (DNA primase)
7 | GTTACGCTGGAACCCCCAATGAACACGCGAGAA | P. acnes phage PAD42 major head protein gene; P. acnes phage PAD20 major head protein gene; P. acnes phage PAD9 major head protein gene; P. acnes phage PAS40 major head protein gene; P. acnes phage PAS12 major head protein gene; P. acnes phage PAS10 major head protein gene; P. acnes phage PAD21 major head protein gene; P. acnes phage PAS2 major head protein gene; P. acnes phage PA6 gp6 (Phage capsid family); P. acnes phage PAS50 gp6 major head protein gene
8 | CGAGGGCTACCACGTGGTCGATTTGGACTGTCG | C. leptum DSM 753 CLOLEP_00167; P. acnes SK137 HMPREF0675_3193 (Domain of unknown function) | Locus 2
9 | CAGGCGCTCCACTCCCTCGCCCTGGCCACCAAC | No hits

Abbreviations: BLAST, Basic Local Alignment Search Tool; CRISPR, Clustered Regularly Interspaced Short Palindromic Repeat; C. leptum, Clostridium leptum; P. acnes, Propionibacterium acnes; RT, ribotype.

As expected, most of the identifiable spacers target known P. acnes phage sequences. However, among the unique CRISPR spacer sequences, one matched locus 2 on the chromosome and three matched the plasmid region (locus 3) in P. acnes genomes of mainly RT4 and RT5. This suggests that these loci may have been acquired by RT4 and RT5 strains, while the genomes of RT2 and RT6 may be capable of protecting against the invasion of the plasmids or other foreign DNA through the CRISPR mechanism.
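The matching criterion described above (identity, or one or two mismatches, between a spacer and a phage or plasmid sequence) can be sketched as a sliding-window comparison on both strands. This is a simplified stand-in for a BLAST search, which is what Table 2 reports; the function names are hypothetical and the approach ignores insertions and deletions.

```python
def reverse_complement(sequence):
    """Return the reverse complement of an uppercase DNA sequence."""
    return sequence.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_spacer_matches(spacer, target, max_mismatches=2):
    """Return (start, strand, mismatches) for each window of target that
    differs from the spacer, on either strand, by at most max_mismatches."""
    spacer = spacer.upper()
    target = target.upper()
    hits = []
    for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
        for start in range(len(target) - len(query) + 1):
            window = target[start:start + len(query)]
            mismatches = sum(1 for a, b in zip(query, window) if a != b)
            if mismatches <= max_mismatches:
                hits.append((start, strand, mismatches))
    return hits
```

For 32-33 nucleotide spacers, a tolerance of two mismatches is unlikely to produce chance hits; for very short test sequences a stricter tolerance is advisable.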

Discussion

The foregoing study of the human skin microbiome associated with acne provides the first portrait of the microbiota of pilosebaceous units at the bacterial strain level. Since P. acnes is the major skin commensal bacterium found in both acne and healthy skin, this strain-level analysis is important to help understand the role of P. acnes in acne pathogenesis and in skin health. A strong association between strains of RT4 and RT5 and acne, and a strong association between strains of RT6 and healthy skin, each with unique genetic elements, have been shown. Other P. acnes strains, including ribotypes 7, 8, 9, and 10, or interactions among different strains, may also contribute to the development of the disease. In addition, host factors, such as hormone level, sebum production, and physical changes in the pilosebaceous unit, may also play a role in acne pathogenesis.

The foregoing metagenomic approach in revealing the association of P. acnes strains with the disease or health is more powerful than previous studies using traditional methods (Lomholt and Kilian, 2010; McDowell et al., 2011). Because the skin microbiota of each individual and each skin site may harbor “good,” “neutral,” and “bad” strains at the same time, which may have different growth rates under in vitro culturing conditions, culturing a few isolates from a disease lesion or healthy skin site may not provide an accurate and unbiased measurement of the association of the strains with the disease or health. The sampling technique and disease associations in the foregoing study did not depend on sampling locations, on the presence of lesions in the sampling field, or on inherently biased culture techniques. While sampling lesional skin intentionally may yield interesting results, these results would not be capable of defining the disease associations that unbiased sampling can. The metagenomic approach employed in the foregoing study to identify underlying strain differences in acne may also be applied to the study of other disease/health associations with commensal or pathogenic bacteria.

Materials and Methods

Subjects

Subjects with acne and subjects with normal skin were recruited from various clinics in Southern California including private practice, managed care, and public hospital settings, as well as outside of dermatology clinics, to best represent the diversity of populations and history of medical care. The subject data are available at dbGaP (www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000263.v1.p1). The diagnosis of acne was made by board-certified dermatologists. The presence of acne was graded on a scale of 0 to 5 relating closely to the Global Acne Severity Scale (Dreno et al., 2011). Grades were recorded separately for the face and the nose, where 0 represents normal skin and 5 represents the most severe inflammatory cystic acne. In acne patients, the grades of the face ranged from 1 to 5 with an average of 2.1, and the grades of the nose ranged from 0 to 2 with an average of 0.3. The presence of scarring was also noted. Subjects with normal skin were determined by board-certified dermatologists and were defined as people who had no acneiform lesions on the face, chest, or back. They were also excluded if they had other skin problems that the investigators felt would affect sampling or the microbial population on the skin. Among the 101 subjects, 59 were female (31 acne patients and 28 normal subjects) and 42 were male (18 acne patients and 24 normal subjects). The average age of the acne cohort was 22.2 and the average age of the normal cohort was 29.6. There was no significant difference in ethnicity between the acne and normal populations. The subjects responded to a written questionnaire, administered by a physician or a well-trained study coordinator who went over each question with the subjects. Most of the subjects had not been treated for acne in the past or were not being treated when samples were collected (FIG. 3). Only nine of the 78 subjects who provided treatment information were being treated for acne when samples were taken. Among the nine subjects, two were being treated with antibiotics, five were being treated with topical retinoids, one was being treated with both antibiotics and retinoids, and one did not list the treatment. Subjects were asked for their acne treatment history (any time in their life). Eighteen of the 73 subjects who provided treatment history had been treated for acne in the past. Among them, seven had been treated with antibiotics, eight had been treated with retinoids, two had been treated with both antibiotics and retinoids, and one did not list the treatment. All subjects provided written informed consent. All protocols and consent forms were approved by both the UCLA and Los Angeles Biomedical Research Institute IRBs. The study was conducted in adherence to the Helsinki Guidelines.

Samples

Skin microcomedone (white head or black head) samples were taken from the nose of the subjects using Bioré Deep Cleansing Pore Strips (Kao Brands Company, Cincinnati, Ohio) following the manufacturer's instructions. Clean gloves were used for each sampling. After being removed from the nose, the strip was immediately placed into a 50 mL sterile tube and kept on ice or at 4° C. The cells were lysed within four hours in most of the cases.

Metagenomic DNA Extraction, 16S rDNA Amplification, Cloning, and Sequencing

Individual microcomedones were isolated from the adhesive nose strip using sterile forceps. Genomic DNA was extracted using the QIAamp DNA Micro Kit (Qiagen, Valencia, Calif.). 16S rDNA was amplified and cloned according to the protocol by HMP, which is described in detail in Supplementary Information. Nearly full length sequences were obtained by the Sanger method.

16S rDNA Sequence Analysis

Base calls and quality scores were determined with Phred (Ewing and Green, 1998; Ewing et al., 1998). Bidirectional reads were assembled and aligned to a core set of NAST-formatted sequences (rRNA16S.gold) using AmosCmp16Spipeline and NAST-ier (Haas et al., 2011). Suspected chimeras were identified using ChimeraSlayer and WigeoN (Haas et al., 2011). 16S rDNA sequences were extensively manually examined. Chromatograms were visually inspected at all bases with a Phred quality score <30. Appropriate corrections were applied. QIIME (Caporaso et al., 2010) was used to cluster the sequences into OTUs.
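A Phred quality score Q corresponds to an estimated base-call error probability of 10^(-Q/10), so the threshold of 30 used above flags bases with an estimated error probability greater than 1 in 1,000. A minimal sketch of selecting positions for manual chromatogram review follows; the function names and the list-of-scores input are illustrative assumptions.

```python
def phred_to_error_probability(quality):
    """Convert a Phred quality score to its estimated error probability."""
    return 10 ** (-quality / 10)


def positions_needing_review(quality_scores, threshold=30):
    """Return 0-based positions whose Phred score is below the threshold."""
    return [i for i, q in enumerate(quality_scores) if q < threshold]


# Hypothetical scores for a short read.
scores = [40, 29, 30, 12]
print(positions_needing_review(scores))
```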

P. acnes Isolation and Genotyping

Colonies with the macroscopic characteristics of P. acnes were picked from each sample plate and were passed twice. The ribotype of each isolate was determined by PCR amplification and sequencing of the full length of the 16S rDNA gene by the Sanger method.
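Once the substitutions of an isolate's 16S rDNA sequence relative to RT1 are known, assignment to one of the top ten ribotypes amounts to a lookup against the nucleotide changes listed in Table 1. The sketch below assumes the substitutions have already been called in the E. coli numbering used in the text; the function name and the string format of the substitutions are illustrative.

```python
# Nucleotide changes from RT1, as listed in Table 1.
TOP_TEN_RIBOTYPES = {
    frozenset(): "RT1",
    frozenset({"T854C"}): "RT2",
    frozenset({"T1007C"}): "RT3",
    frozenset({"G1058C", "A1201C"}): "RT4",
    frozenset({"G1058C"}): "RT5",
    frozenset({"T854C", "C1336T"}): "RT6",
    frozenset({"G529A"}): "RT7",
    frozenset({"G1004A", "T1007C"}): "RT8",
    frozenset({"G1268A"}): "RT9",
    frozenset({"T554C", "G1058C"}): "RT10",
}


def assign_ribotype(substitutions):
    """Return the top-ten ribotype whose changes from RT1 match exactly,
    or None if the substitution set is not one of the top ten."""
    return TOP_TEN_RIBOTYPES.get(frozenset(substitutions))


print(assign_ribotype(["G1058C", "A1201C"]))
```

An exact-set match is used because several ribotypes share a substitution (for example, G1058C occurs in RT4, RT5, and RT10), so any additional change implies a different ribotype.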

Whole Genome Shotgun Sequencing, Assembly, and Annotation

Genome HL096PA1 was sequenced using Roche/454 FLX and was assembled using a combination of PHRAP/CONSED (Gordon et al., 1998) and GSMAPPER (Roche, Branford, Conn.) with extensive manual editing in CONSED. The remaining 65 genomes were sequenced using Illumina/Solexa GAIIx (Illumina, San Diego, Calif.). Sequence datasets were processed by quality trimming and were assembled using Velvet (Zerbino and Birney, 2008). Coding sequences were predicted using GeneMark (Borodovsky and McIninch, 1993) and GLIMMER (Salzberg et al., 1998). The final gene set was processed through a suite of protein categorization tools consisting of Interpro, psort-b and KEGG. A more detailed protocol can be found at hmpdacc.org/doc/sops/reference_genomes/annotation/WUGC_SOP_DACC.pdf.

Comparative Genome Analysis

Seventy-one P. acnes genome sequences were compared using Nucmer (Kurtz et al., 2004). Phylogenetic analysis was performed using MEGA5 (Tamura et al., 2007). CRISPRFinder (Grissa et al., 2007) was used to identify the CRISPR repeat-spacer sequences.

Supplementary Information

16S rDNA Sequence of KPA171202

All sequenced P. acnes genomes encode three copies of 16S rRNA genes, which are identical within each isolate, except KPA171202. Based on the KPA171202 genome (Bruggemann et al., 2004), one copy of the 16S rRNA gene has one nucleotide difference from the other two identical copies of RT1. However, this mutation was never observed in the 16S rDNA dataset. Multiple clones of the 16S rDNA gene from KPA171202 were amplified, cloned, and sequenced, and a sequence harboring this mutation was not found. Thus, KPA171202 also has three identical copies of 16S rDNA.

Comparison of P. acnes Strain Distribution to Other Human Microbiome Datasets

To determine whether the P. acnes ribotypes and their relative abundances measured in this study are unique to pilosebaceous units, a similar analysis was applied to the microbiome 16S rDNA data from the Human Microbiome Project (HMP) and the data from Grice et al. (2009). Both datasets were obtained from healthy subjects. The relative abundance of the major ribotypes in healthy subjects from this study was largely similar to that found in these two datasets despite the fact that they were sampled from different anatomical sites (FIG. 3). RT6 (6.3%) was found to be more abundant than RT4 and RT5 combined (2.8%) in the HMP data, similar to the normal cohort in this study, where RT6 represents 4.8% and RT4 and RT5 combined represent 1.2% of the clones. The same five main microbiome types were observed in the two datasets (FIG. 7).

Genome Clustering and Phylogenetic Tree

The recA gene has been widely used to classify P. acnes strains into four known types: IA, IB, II, and III (McDowell et al., 2008; McDowell et al., 2005). The phylogenetic tree of the 71 genomes based on the SNPs in the core genome matched the recA types perfectly except for one isolate, HL097PA1. Most of the genomes with ribotypes 1, 4, 5, and 532 were grouped into the recA Type IA clade, which can be further divided into subclades IA-1 and IA-2. Clade IA-2 is composed of mostly RT4 and RT5. RT4 and most of the RT5 genomes seem to belong to the same lineage with very similar genome sequences. All the isolates with ribotypes 3, 8, and 16, which share the T1007C mutation in the 16S rDNA gene, were grouped into the recA Type IB clade. Most of the RT3 genomes form a subclade, IB-2, and the RT8 genomes form a subclade by themselves, IB-1, which was highly associated with acne. Notably, RT2 and RT6, which share the T854C mutation, have a more distant phylogenetic relationship to other ribotypes, and were grouped into the recA Type II clade. This is consistent with previous studies (Lomholt and Kilian, 2010; McDowell et al., 2005). P. acnes isolates with recA type III were not found in the samples.

The associations of P. acnes lineages with health and disease states were further analyzed. There was a clear shift of the association strength of the clades with acne along the phylogenetic tree (FIG. 9). The three sequenced ribotypes identified as being strongly associated with acne (RT4, RT5, and RT8) were found at one end of the tree in clades IA-2 and IB-1, while RT6, identified as being associated with normal skin, was at the other end of the tree at the tip of clade II (FIG. 9).

Antibiotic Resistance

P. acnes ribotypes 4, 5, and 10 have a single nucleotide substitution G1058C in the 16S rDNA sequences, which has previously been shown to confer increased resistance to tetracycline (Ross et al., 1998a; Ross et al., 2001). In addition to the substitution in the 16S rDNA sequences, it was determined that all the strains of RT4 and RT5 that were sequenced have a nucleotide substitution in the 23S rDNA sequences, which confers increased resistance to a different class of antibiotics, erythromycin and clindamycin (Ross et al., 1997; Ross et al., 1998b). It was experimentally confirmed that these isolates, except two that were unculturable, were resistant to tetracycline, erythromycin, and clindamycin.

It was also examined whether the enrichment of these ribotypes in the acne group could be due to antibiotic treatment. However, in the study only a small percentage of the subjects harboring ribotypes 4, 5, or 10 were treated with antibiotics (Table S2).

TABLE S2 Past and current treatments of the subjects

Number of subjects: Acne, 49; Normal, 52.

Row | Acne, with RT4, RT5, or RT10 | Acne, without RT4, RT5, and RT10 | Normal, with RT4, RT5, or RT10 | Normal, without RT4, RT5, and RT10
Number of subjects in each subgroup | 20 | 29 | 9 | 43
Subjects reported on current treatment | 14 | 25 | 8 | 31
  no treatment | 10 | 21 | 8 | 30
  antibiotics | 0 | 2 | 0 | 0
  retinoids | 3 | 2 | 0 | 0
  antibiotics and retinoids | 0 | 0 | 0 | 1
  unknown | 1 | 0 | 0 | 0
Subjects reported on past treatment | 12 | 22 | 8 | 31
  no treatment | 5 | 16 | 6 | 28
  antibiotics | 2 | 4 | 0 | 1
  retinoids | 4 | 1 | 1 | 2
  antibiotics and retinoids | 1 | 0 | 1 | 0
  unknown | 0 | 1 | 0 | 0

Eighteen of the 29 subjects who harbored any of these three ribotypes gave reports on both past and current treatments. Among them, 50% (9/18) of the subjects were never treated; 33% (6/18) were treated with retinoids; 11% (2/18) were treated with antibiotics in the past, and 5.6% (1/18) were treated with both antibiotics and retinoids in the past. The theory of selection by antibiotic treatment is not favored by this study. Previous surveys of antibiotic resistant strains in acne patients demonstrated that previous use of antibiotics did not always result in the presence of resistant strains and that some patients without previous use of antibiotics harbored resistant strains (Coates et al., 2002; Dreno et al., 2001). Observations in this study are consistent with previous studies.

CRISPR Spacer Sequences

Although more similar to the GC content of P. acnes genomes, four unique spacer sequences found in strains of RT2 and RT6 have the best matches to the genome of Clostridium leptum, a commensal bacterium in the gut microbiota (Table 2). On the 55 Kb plasmid harbored in HL096PA1 and other RT4 and RT5 genomes, there is also a large cluster of 35 genes that are identical to the genes found in C. leptum, including the Tad locus.
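GC content, referred to above, is the fraction of bases in a sequence that are G or C. A minimal sketch for computing it, for example to compare a spacer against a genome-wide value, is shown below; the function name is illustrative and no GC values from the study are assumed.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


print(gc_content("ATGC"))
```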

Materials and Methods

Metagenomic DNA Extraction, PCR Amplification, Cloning and 16S rDNA Sequencing

Metagenomic DNA Extraction

Individual microcomedones were isolated from the adhesive nose strip using sterile forceps and placed in a 2 mL sterile microcentrifuge tube filled with ATL buffer (Qiagen) and 0.1 mm diameter glass beads (BioSpec Products, Inc., Bartlesville, Okla.). Cells were lysed using a beadbeater for 3 minutes at 4,800 rpm at room temperature. After centrifugation at 14,000 rpm for 5 minutes, the supernatant was retrieved and used for genomic DNA extraction using the QIAamp DNA Micro Kit (Qiagen). The manufacturer protocol for extracting DNA from chewing gum was used. Concentration of the genomic DNA was determined by NanoDrop 1000 Spectrophotometer.

16S rDNA PCR Amplification, Cloning and Sequencing

Most of the metagenomic samples were amplified in triplicate using 16S rDNA specific primers with the following sequences: 27f-MP 5′-AGRGTTTGATCMTGGCTCAG-3′ and 1492r-MP 5′-TACGGYTACCTTGTTAYGACTT-3′. PCR reactions contained 0.5 U/μL Platinum Taq DNA Polymerase High Fidelity (Invitrogen), 1× Pre-mix E PCR buffer from the Epicentre Fail-Safe PCR system, 0.12 μM concentration of each primer 27f-MP and 1492r-MP, and Sigma PCR grade water. One microliter of DNA (ranging from 0.2-10 ng total) was added to each reaction. The G-Storm GS4 thermocycler conditions were as follows: initial denaturation at 96° C. for 5 minutes, and 30 cycles of denaturation at 94° C. for 30 seconds, annealing at 57° C. for 1 minute, and extension at 72° C. for 2 minutes, with a final extension at 72° C. for 7 minutes. Following amplification, an A-tailing reaction was performed by the addition of 1 U of GoTaq DNA Polymerase directly to the amplification reaction and incubation in the thermocycler at 72° C. for 10 minutes.
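The primers above contain IUPAC degenerate bases (R = A or G, M = A or C, Y = C or T), so each primer name denotes a small pool of sequences. The following sketch, with a hypothetical function name, expands a degenerate primer into the explicit sequences it represents using the standard IUPAC nucleotide codes.

```python
from itertools import product

# Standard IUPAC nucleotide codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """Return every explicit sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combination) for combination in product(*choices)]


for variant in expand_degenerate("AGRGTTTGATCMTGGCTCAG"):
    print(variant)
```

Each of the two primers contains two twofold-degenerate positions and therefore expands to four explicit sequences.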

The three PCR amplification reactions from each source DNA were pooledand gel purified (1.2% agarose gel stained with SYBR Green fluorescentdye). The 1.4 Kb product was excised and further purified using theQiagen QIAquick Gel Extraction kit. The purified product was cloned intoOneShot E. coli cells using TOPO TA cloning kit from Invitrogen.

White colonies were picked into a 384-well tray containing terrificbroth, glycerol, and kanamycin using a Qpix picking robot. Each tray wasprepared for sequencing using a magnetic bead prep from Agilent andsequenced with 1/16th Big Dye Terminator from ABI. Sequencing was donewith a universal forward, universal reverse, and for a subset, internal16S rDNA primer 907R with sequences of TGTAAAACGACGGCCAGT (forward),CAGGAAACAGCTATGACC (reverse), and CCGTCAATTCCTTTRAGTTT (907R). Sequencereactions were loaded on ABI 3730 machines from ABI on 50 cm arrays witha long read run module.

A slightly different PCR and cloning protocol without automation wasused for several initial samples as described below. 16S rDNA wasamplified using universal primers 8F (5′-AGAGTTTGATYMTGGCTCAG-3′) and1510R (5′-TACGGYTACCTTGTTACGACTT-3′) (Gao et al., 2007). Thermocyclingconditions were as following: initial denaturation step of 5 minutes at94° C., 30 cycles of denaturation at 94° C. for 45 seconds, annealing at52° C. for 30 seconds and elongation at 72° C. for 90 seconds, and afinal elongation step at 72° C. for 20 minutes.

PCR products were purified using the DNA Clean and Concentrator Kit (Zymo Research). Subsequently, the 16S rDNA amplicons were cloned into the pCR2.1-TOPO vector (Invitrogen). One-Shot TOP-10 Chemically Competent E. coli cells (Invitrogen) were transformed with the vectors and plated on selective media. Individual positive colonies were picked and inoculated into selective LB liquid medium. After 14 hours of incubation, the plasmids were extracted and purified using the PrepEase MiniSpin Plasmid Kit (USB Corporation) or the Zyppy Plasmid Miniprep Kit (Zymo Research). The clones were sequenced bidirectionally using the Sanger sequencing method with ⅛th chemistry on an ABI 3730 sequencer (Applied Biosystems Inc.).

P. acnes Isolation and Culturing

Sample Culture Plate

Microcomedones on the inner surface of the nose strip were mashed and scraped using a sterile loop (Fisherbrand, Pittsburgh, Pa.), and plated onto a blood agar plate (Teknova Brucella Agar Plate with Hemin and Vitamin K, Teknova, Hollister, Calif.). The plates were incubated anaerobically at 37° C. for 5-7 days using the AnaeroPack System (Mitsubishi Gas Chemical Company, Tokyo, Japan).

Isolation and Culturing of Individual Strains

Colonies with the macroscopic characteristics of P. acnes were picked from each sample plate and streaked onto A-media plates (Pancreatic Digest of Casein, Difco yeast extract, glucose, KH2PO4, MgSO4, Difco Agar, and water). These first-pass plates were then incubated anaerobically at 37° C. for 5-7 days. As the second pass, single isolated colonies were picked from the first-pass plates and streaked onto new A-media plates. These plates were then incubated anaerobically at 37° C. for 5-7 days. The colonies on these plates were picked for culturing, genotyping, and genome sequencing in the subsequent steps.

Genotyping of the P. acnes Isolates

Each isolate was analyzed by PCR amplification of the 16S rDNA gene. The ribotypes were determined based on the full-length sequences. Isolates with desired ribotypes were selected for future culturing and genome sequencing.

Genomic DNA Extraction of P. acnes Isolates

Isolates were grown in 5 mL of Clostridial medium under anaerobic conditions at 37° C. for 5-7 days. Cultures were pelleted by centrifugation and washed with 3 mL of phosphate-buffered saline (PBS). The same protocol used for the metagenomic DNA extraction was used for extracting the genomic DNA of the isolates.

Metagenomic Shotgun Sequencing and Analysis

Metagenomic DNA samples from microcomedone samples from 22 individuals with normal skin were pooled and sequenced using Roche/454 FLX. The average read length was 236 bp. The sequencing was limited to 13,291 sequence reads. Sequence reads were aligned against NCBI's non-redundant database using BLAST. Species assignment was based on 97% identity and 100% of the read length aligned.

Assembly, Alignment and Editing of 16S rDNA Sequences

Assembly and Alignment

Base calling and quality were determined with Phred (Ewing and Green, 1998; Ewing et al., 1998) using default parameters. Bidirectional reads were assembled and aligned to a core set of NAST-formatted sequences (rRNA16S.gold) using AmosCmp16Spipeline and NAST-ier, which are from the Microbiome Utilities Portal of the Broad Institute (microbiomeutil.sourceforge.net/). These tools in turn use Amoscmp (Pop et al., 2004), Mummer (Kurtz et al., 2004), Lucy (Chou and Holmes, 2001), BLAST (Altschul et al., 1990) and CdbTools (compbio.dfci.harvard.edu/tgi/software/). Suspected chimeras were identified using ChimeraSlayer and WigeoN (Haas et al., 2011). Sequences with at least 90% bootstrap support for a chimeric breakpoint (ChimeraSlayer) or containing a region that varies at more than the 99% quantile of expected variation (WigeoN) were removed from further analysis.

Quality Screening

For diversity analysis of the P. acnes population, sequences with at least 99% identity over 1,400 nucleotides to P. acnes KPA171202 (Bruggemann et al., 2004) 16S rDNA were trimmed to positions 29-1483 (numbering based on the E. coli system of nomenclature (Brosius et al., 1978)). Sequences without full coverage over this region were excluded from further strain-level analysis. Chimera screening, as described above, resulted in removal of less than 0.35% of the sequences. This may be an underestimate of the chimeras, since the majority of sequences differ by only 1 or 2 nucleotides. Low-quality sequences, defined as those with more than 50 nucleotides between positions 79 and 1433 having Phred quality scores of less than 15, were excluded. To allow detailed strain-level analysis, the data were extensively manually edited. Chromatograms were visually inspected at all bases with a Phred quality score <30, and appropriate corrections were applied. For analysis at the species level, the 16S rDNA sequences were not manually edited. Chimera screening of assembled sequences resulted in removal of less than 0.65% of the sequences. Aligned sequences were trimmed to E. coli equivalent positions 29-1483 (Brosius et al., 1978). Sequences without full coverage over this region were excluded from further analysis.

Sequence Editing

Nearly 62,000 Sanger sequence reads representing the 26,446 assembled P. acnes sequences were mapped to the RT1 sequence in CONSED (Gordon, 2003; Gordon et al., 1998). Comprehensive semi-manual editing of the large number of sequences was made feasible by their very high pairwise similarities: a median of only one nucleotide change from RT1 per sequence (three nucleotide changes prior to editing). Editing was facilitated by the use of scripts and the custom navigation feature of CONSED, which allows single-click jumps to sites requiring inspection. Chromatograms were inspected for all low-quality (Phred <30) bases that differed from RT1, and corrected as needed, including many commonly occurring sequence errors. In order to minimize the effect of base mis-incorporation and chimeras, specific base differences from RT1 occurring in fewer than 4 sequences (frequency <0.00015) were considered unreliable and reverted to the corresponding RT1 base. Ribotypes were assigned for the resulting sequences based on 100% identity.
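The rare-difference reversion rule can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned to the RT1 reference and have equal length; the function name `revert_rare_differences` and the `min_count` parameter are hypothetical and are not part of CONSED or of the scripts used in the study.

```python
from collections import Counter

def revert_rare_differences(reference, sequences, min_count=4):
    """Revert base differences from the reference seen in fewer than
    min_count sequences (hypothetical helper; assumes aligned input)."""
    # Count how many sequences carry each (position, base) difference.
    counts = Counter()
    for seq in sequences:
        for pos, (ref_base, base) in enumerate(zip(reference, seq)):
            if base != ref_base:
                counts[(pos, base)] += 1

    cleaned = []
    for seq in sequences:
        bases = list(seq)
        for pos, (ref_base, base) in enumerate(zip(reference, seq)):
            if base != ref_base and counts[(pos, base)] < min_count:
                bases[pos] = ref_base  # unreliable difference: use reference base
        cleaned.append("".join(bases))
    return cleaned
```

After reversion, sequences that are 100% identical can be grouped (for example with `collections.Counter`) so that each distinct sequence corresponds to one ribotype.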

16S rDNA Sequence Analysis

OTUs and Taxonomy Assignments

QIIME (Caporaso et al., 2010b) was used to cluster the sequences into OTUs using a 99% identity cutoff, furthest neighbor, and UCLUST (Edgar, 2010). Representative sequences (most abundant) were selected and aligned using PYNAST (Caporaso et al., 2010a) to the greengenes database. Taxonomy was assigned using the RDP method (Cole et al., 2009). The alignment was filtered with the lanemask provided by greengenes, and a phylogenetic tree was built using FastTree (Price et al., 2009).

Wilcoxon Test on the Top Ten Ribotypes

For each sample, the number of clones of each of the top ten ribotypes was normalized by the total number of P. acnes clones of the sample. The normalized counts were used to test the significance of enrichment between the acne group and the normal group. The function wilcox_test in the R program (www.R-project.org) was used to calculate the p-values.

Microbiome Type Assignments

Microbiome types were assigned based on the largest clades seen when samples were clustered using thetayc similarity in MOTHUR (Schloss et al., 2009) (FIGS. 5 and 6) or hierarchical clustering (Eisen et al., 1998) (FIG. 7).

Assigning Ribotypes to Datasets of HMP and Grice et al. 2009

Sequences were assigned to a ribotype if they met the following criteria. First, there was a single best match. Second, the sequence covered the range required to discriminate between the top 45 ribotypes (58-1388). Third, there were no Ns at discriminatory positions. Lastly, there were no more than ten non-discriminatory differences.
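The four criteria can be expressed as a short Python sketch. Several points are assumptions made only for illustration: sequences are strings in a shared alignment with "-" for uncovered positions and "N" for masked bases, indices are 0-based and inclusive, and the best match is taken to be the ribotype with the fewest mismatches over the required range. The function name `assign_ribotype` and all of its parameters are hypothetical.

```python
def assign_ribotype(query, ribotypes, disc_positions, start, end,
                    max_other_diffs=10):
    """Return the matching ribotype name, or None if the query is unassigned.

    query and each ribotype sequence share one alignment; '-' marks an
    uncovered position and 'N' a masked low-quality base (assumed encoding).
    """
    region = range(start, end + 1)
    # The query must cover the whole discriminating range.
    if len(query) <= end or any(query[i] == "-" for i in region):
        return None
    # No Ns are allowed at discriminatory positions.
    if any(query[i] == "N" for i in disc_positions):
        return None

    def mismatches(ref, positions):
        # Masked bases (N) are not counted as differences.
        return sum(1 for i in positions
                   if query[i] != "N" and query[i] != ref[i])

    scores = {name: mismatches(ref, region) for name, ref in ribotypes.items()}
    best = min(scores.values())
    winners = [name for name, score in scores.items() if score == best]
    # There must be a single best match.
    if len(winners) != 1:
        return None
    # Limit the number of non-discriminatory differences.
    other = [i for i in region if i not in disc_positions]
    if mismatches(ribotypes[winners[0]], other) > max_other_diffs:
        return None
    return winners[0]
```

Passing `disc_positions` as a set keeps the membership checks fast when the range spans more than a thousand positions.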

The HMP 16S rDNA Sanger sequence dataset was downloaded with permission from the HMP Data Analysis and Coordination Center. It has 8,492 P. acnes sequences from 14 subjects and nine body sites (retroauricular crease, anterior nares, hard palate, buccal mucosa, throat, palatine tonsils, antecubital fossa, saliva, and subgingival plaque). More details on the dataset can be found at www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000228.v2.p1. In this dataset, low-quality bases (Phred quality <20) were converted to Ns, and 26% of the sequences were not assigned due to excessive Ns or Ns at ribotype discriminatory sites. Less than 1% were unresolved due to equal best matches or greater than ten mismatches to RT1.

The dataset from Grice et al. (2009) is available at NCBI (GenBank accession numbers GQ000001 to GQ116391). It has 22,378 P. acnes sequences from ten subjects and 21 skin sites (buttock, elbow, hypothenar palm, volar forearm, antecubital fossa, axillary vault, gluteal crease, inguinal crease, interdigital web space, nare, plantar heel, popliteal fossa, toe web space, umbilicus, alar crease, back, external auditory canal, glabella, manubrium, occiput, and retroauricular crease). Three percent of the sequences were unassigned due to greater than ten mismatches to RT1, and 1.6% were unassigned due to equal best matches.

For comparison, the unedited 16S rDNA sequences were assigned to ribotypes by the same method described above, and the result is shown in FIG. 3. Less than 0.6% of the sequences were unassigned due to greater than ten mismatches to RT1, and 1.7% were unassigned due to equal best matches.

Whole Genome Shotgun Sequencing, Assembly and Annotation of 66 P. acnes Isolates

Genome HL096PA1

The genome was sequenced using Roche/454 FLX at the UCLA Genotyping and Sequencing Core. A total of 590,054 sequence reads were generated with an average read length of 230 bp. Of these, 433,896 were assembled into two contigs, a circular main chromosome of 2,494,190 bp and a linear plasmid of 55,585 bp. Assembly was accomplished by a combination of PHRAP/CONSED (Gordon et al., 1998) and GSMAPPER (Roche) with extensive manual editing in CONSED. GeneMark v2.6r (Borodovsky and McIninch, 1993) and GLIMMER v2.0 (Salzberg et al., 1998) were used to perform ab initio protein coding gene prediction. tRNAscan-SE 1.23 was used for tRNA identification and RNAmmer was used for predicting ribosomal RNA genes (5S, 16S, and 23S). Genome annotation results were based on automated searches in public databases, including Pfam (pfam.jouy.inra.fr/), KEGG (www.genome.jp/kegg), and COG (www.ncbi.nlm.nih.gov/COG/). Manual inspection of the annotation was also performed.

Genomes of the Other 65 Isolates

The genomes were sequenced using Illumina/Solexa Genome Analyzer IIx and annotated by the Genome Center of Washington University at St. Louis.

Assembly

Each genomic DNA sample was randomly sheared and an indexed library was constructed using standard Illumina protocols. Twelve uniquely tagged libraries were pooled and run on one lane of a GAIIx flowcell, and paired-end sequences were generated. Following deconvolution of the tagged reads into the separate samples, datasets were processed using BWA (Li and Durbin, 2009) quality trimming at a q10 threshold. Reads trimmed to less than 35 bp in length were discarded and the remaining reads were assembled using oneButtonVelvet, an optimizer program that runs the Velvet assembler (Zerbino and Birney, 2008) numerous times over a user-supplied k-mer range while varying several of the assembler parameters and optimizing for the assembly parameter set that yields the longest N50 contig length.
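The optimization target, N50 contig length, can be computed as in the following minimal Python sketch. It is not the oneButtonVelvet code; the function names `n50` and `pick_best_assembly` are hypothetical, and each candidate assembly is assumed to be available simply as a list of contig lengths.

```python
def n50(contig_lengths):
    """Return the N50: the contig length L such that contigs of length >= L
    together contain at least half of all assembled bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contigs given")

def pick_best_assembly(assemblies):
    """assemblies maps a parameter setting (e.g. k-mer size) to the list of
    contig lengths it produced; return the setting with the longest N50."""
    return max(assemblies, key=lambda setting: n50(assemblies[setting]))
```

An optimizer of this kind would run the assembler once per parameter setting, record the contig lengths, and keep the assembly returned by `pick_best_assembly`.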

Annotation

Coding sequences were predicted using GeneMark v3.3 (Borodovsky and McIninch, 1993) and GLIMMER v2.13 (Salzberg et al., 1998). Intergenic regions not spanned by GeneMark and GLIMMER were aligned using BLAST against NCBI's non-redundant database and predictions were generated based on protein alignments. tRNA genes were determined using tRNAscan-SE 1.23 and non-coding RNA genes were determined by RNAmmer-1.2 and Rfam v8.0. The final gene set was processed through a suite of protein categorization tools consisting of Interpro, psort-b and KEGG. The gene product naming comes from the BER pipeline (JCVI). A more detailed standard operating protocol (SOP) can be found at hmpdacc.org/doc/sops/reference_genomes/annotation/VVUGC_SOP_DACC.pdf.

71 P. acnes Genome Analysis and Comparison

Identification of the Core Regions of P. acnes Genomes

The “core” regions were defined as genome sequences that are present in all 71 genomes. P. acnes KPA171202 was used as the reference genome. Each of the other 70 genome sequences (a series of contigs in most of the genomes and two complete genomes) was mapped to the reference genome using Nucmer (Kurtz et al., 2004). All 70 “.coords” output files of the Nucmer program were analyzed to identify overlap regions based on the KPA171202 coordinates using a Perl script. Finally, “core” sequences were extracted based on the genome sequence of KPA171202 with the coordinates calculated above. On average, 90% (ranging from 88% to 92%) of the genomes were included in the core regions.
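The overlap step amounts to intersecting, across all mapped genomes, the reference intervals covered by each genome. The study used a Perl script that is not reproduced here; the following Python sketch shows one way to do it, assuming the alignment intervals have already been parsed from each “.coords” file into inclusive, 1-based (start, end) pairs on the reference. All function names are hypothetical.

```python
def merge(intervals):
    """Merge overlapping or adjacent inclusive intervals into a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]

def intersect(a, b):
    """Intersect two sorted, non-overlapping interval lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo <= hi:
            out.append((lo, hi))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

def core_regions(reference_length, per_genome_hits):
    """Reference intervals covered by every genome's alignments."""
    core = [(1, reference_length)]
    for hits in per_genome_hits:
        core = intersect(core, merge(hits))
    return core
```

The fraction of the reference in the core is then the summed interval length divided by `reference_length`.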

Identification of SNPs in the Core Regions

Single nucleotide polymorphisms (SNPs) were identified by using the “show-snps” utility option of the Nucmer program (Kurtz et al., 2004) with the default settings. The genome sequence of P. acnes KPA171202 was used as the reference genome. All 70 “.snps” output files of the Nucmer program were analyzed to identify unique SNP positions based on the KPA171202 coordinates using a Perl script. The SNPs in the core regions were further analyzed to construct a phylogenetic tree.

Phylogenetic Tree Construction

The 71 concatenated sequences of the 96,887 SNP nucleotides in the core regions were used to construct a phylogenetic tree of the P. acnes genomes. The evolutionary distance of the core regions among the genomes was inferred using the Neighbor-Joining method (Saitou and Nei, 1987). The bootstrap tree inferred from 1,000 replicates was taken. Branches corresponding to partitions reproduced in less than 80% of bootstrap replicates were collapsed. FIG. 8 shows only the topology. In FIG. 9, the tree was drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the p-distance method and are in units of the number of nucleotide differences per site. This tree shows the comparison based on only the core regions. The distance does not represent the true evolutionary distance between different genomes, since the non-core regions of each genome were not considered here. All positions containing gaps and missing data were eliminated. Evolutionary analysis was conducted using MEGA5 (Tamura et al., 2007).
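The p-distance with elimination of gapped or missing positions can be illustrated as follows. This is a minimal Python sketch, not the MEGA5 implementation; it assumes equal-length aligned sequences in which "-", "N" and "?" mark gaps or missing data, and the function names are hypothetical.

```python
def complete_deletion(sequences, missing="-N?"):
    """Drop every alignment column in which any sequence has a gap or
    missing-data symbol (assumed symbols: '-', 'N', '?')."""
    keep = [i for i in range(len(sequences[0]))
            if all(seq[i] not in missing for seq in sequences)]
    return ["".join(seq[i] for i in keep) for seq in sequences]

def p_distance(seq_a, seq_b):
    """Number of nucleotide differences per compared site."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)

def distance_matrix(sequences):
    """Pairwise p-distances after removing gapped or missing columns."""
    cleaned = complete_deletion(sequences)
    return [[p_distance(a, b) for b in cleaned] for a in cleaned]
```

The resulting matrix is the kind of input a Neighbor-Joining routine would take to build the tree.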

Gene Content Comparison

In order to assess the conservation of gene content across the 71 genomes, protein coding genes in all the genomes were clustered using UCLUST (Edgar, 2010) by first sorting by decreasing length and then clustering each sequence to an existing seed sequence if it had at least 90% nucleotide identity over its entire length; otherwise it became a new seed. For visualization, the data were reformatted to columns and rows representing genes and genomes, respectively. One or more copies of a gene in a genome were treated as present. Gene columns were sorted by their position based on the coordinates of the HL096PA1 genome, a fully finished genome with a 55 kb plasmid. Genome rows were sorted by their positions in the SNP-based Neighbor-Joining tree described above.

Identification of CRISPR/Cas

CRISPRFinder (Grissa et al., 2007) was used to identify the CRISPR repeat-spacer sequences. The annotation of HL110PA3 was used for BLAST alignment in order to identify the presence of the CRISPR/Cas structure and CRISPR repeat-spacer sequences in strains HL001PA1, HL060PA1, HL082PA2, HL103PA1, HL106PA1, HL110PA4 and J139. Each spacer sequence was annotated by BLAST alignment against NCBI's non-redundant nucleotide database and the reference genomic sequences database (refseq_genomic).

Sequence Coverage Analysis

MAQ (Li et al., 2008) was used to map the raw sequence reads from the Illumina/Roche platform to the reference genomes. Briefly, the “map” command was used for mapping, the “assemble” command was used for calling the consensus sequences from read mapping, and the “cns2win” command was then used to extract information averaged in a tiling window. A window size of 1,000 bp was used. One million randomly selected reads were used for mapping. This accounted for approximately 40× coverage for all the genomes except HL096PA2, HL096PA3, HL097PA1 and HL099PA1, which had approximately 55× to 75× coverage. BWA (Li and Durbin, 2010) was used to map the raw sequence reads from the Roche/454 platform to the reference genome HL096PA1. The average coverage was calculated in 1,000 bp windows.
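The window averaging itself is simple and can be sketched in Python as below. The sketch assumes per-base read depths have already been extracted into a list; it does not call MAQ or BWA, and the function name `window_coverage` is hypothetical.

```python
def window_coverage(depths, window=1000):
    """Average per-base depth in consecutive, non-overlapping windows.

    depths is a list of read depths, one per reference position; the last
    window may be shorter than the others.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    for start in range(0, len(depths), window):
        chunk = depths[start:start + window]
        averages.append(sum(chunk) / len(chunk))
    return averages
```

Windows whose average falls far below the genome-wide level indicate reference regions, such as a plasmid, that are absent from the sequenced strain.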

Quantitative PCR

Quantitative PCR (qPCR) targeting TadA on the plasmid (Locus 3) and housekeeping genes Pak and RecA on the chromosome was performed using the genomic DNA extracted from the P. acnes isolates. The LightCycler 480 High Resolution Melting Master kit was used (Roche Diagnostics GmbH, Mannheim, Germany). Each 10 μL reaction solution consisted of 5 μL master mix (2× concentrate), 1 μL 25 mM MgCl2, 0.5 μL 4 μM forward and reverse primers, and DNA template. Four qPCR runs were performed on a Roche LightCycler 480. Primer sequences for TadA are 5′-GATAATCCGTTCGACAAGCTG-3′ (forward) and 5′-ACCCACCACGATGATGTTT-3′ (reverse). Primer sequences for pak are 5′-CGACGCCTCCAATAACTTCC-3′ (forward) and 5′-GTCGGCCTCCTCAGCATC-3′ (reverse). Primer sequences for recA are 5′-CCGGAGACAACGACAGGT-3′ (forward) and 5′-GCTTCCTCATACCACTGGTCATC-3′ (reverse). All samples were run in duplicate in each qPCR run, except the second run, which was not duplicated. Thermocycling conditions were as follows: an initial activation step of 10 minutes at 95° C.; 50 amplification cycles, each consisting of 10 seconds at 95° C., 15 seconds at 65° C. in the first cycle with a stepwise 0.5° C. decrease for each succeeding cycle, and 30 seconds at 72° C.; and a final melting curve step starting at 65° C. and ending at 99° C. with a ramp rate of 0.02° C./s and an acquisition rate of 25/° C. DNA concentration standards were run in duplicate. Copy number ratios of genes were calculated based on the concentrations of the genes on the plasmid and chromosome.

Data Availability

16S rDNA sequences have been deposited at GenBank under the project ID 46327. Whole genome shotgun sequences and annotations of the P. acnes strains have been deposited at GenBank under the accession numbers ADWB00000000, ADWC00000000, ADWF00000000, ADWH00000000, ADWI00000000, ADXP00000000, ADXQ00000000, ADXR00000000, ADXS00000000, ADXT00000000, ADXU00000000, ADXW00000000, ADXX00000000, ADXY00000000, ADXZ00000000, ADYA00000000, ADYB00000000, ADYC00000000, ADYD00000000, ADYE00000000, ADYF00000000, ADYG00000000, ADYI00000000, ADYJ00000000, ADYK00000000, ADYL00000000, ADYM00000000, ADYN00000000, ADYO00000000, ADYP00000000, ADYQ00000000, ADYR00000000, ADYS00000000, ADYT00000000, ADYU00000000, ADYV00000000, ADYW00000000, ADYX00000000, ADYY00000000, ADYZ00000000, ADZA00000000, ADZB00000000, ADZC00000000, ADZD00000000, ADZE00000000, ADZF00000000, ADZG00000000, ADZH00000000, ADZI00000000, ADZJ00000000, ADZK00000000, ADZL00000000, ADZM00000000, ADZN00000000, ADZO00000000, ADZP00000000, ADZQ00000000, ADZR00000000, ADZS00000000, ADZT00000000, ADZV00000000, ADZW00000000, CP003293, and CP003294.

Example 2—Pan-Genome and Comparative Genome Analysis of Propionibacterium acnes

Propionibacterium acnes is a major human skin bacterium. To understand whether different strains have different virulence properties and thus play different roles in health and disease, the genomes of 82 P. acnes strains, most of which were isolated from acne or healthy skin, were compared. Lineage-specific genetic elements were identified that may explain the phenotypic and functional differences of P. acnes as a commensal in health and as a pathogen in disease. By analyzing a large number of sequenced strains, an improved understanding of the genetic landscape and diversity of the organism at the strain level and at the molecular level is provided.

Introduction

Propionibacterium acnes is a major commensal of the human skin. It contributes to maintaining skin health by inhibiting the invasion of common pathogens, such as Staphylococcus aureus and Streptococcus pyogenes. It does so by hydrolyzing triglycerides and releasing free fatty acids that contribute to the acidic pH of the skin surface (1). On the other hand, P. acnes has been historically linked to acne vulgaris, a chronic inflammatory disease of the pilosebaceous unit affecting more than 85% of adolescents and young adults (2). A metagenomic study previously demonstrated that P. acnes was a dominant bacterium in the pilosebaceous unit in both healthy individuals and acne patients (3, 4). At the strain level, however, the population structures of P. acnes were different between the two groups. These findings suggested that microbe-related human diseases are often caused by certain strains of a species rather than the entire species, in line with the studies of other diseases (5, 6).

P. acnes has been classified into three distinct types. Studies by Johnson and Cummins (7) first revealed two distinct phenotypes of P. acnes, known as types I and II, that could be distinguished based on serological agglutination tests and cell wall sugar analysis. McDowell et al. (8) differentiated types I and II P. acnes by monoclonal antibody typing. Furthermore, their phylogenetic analysis of P. acnes strains based on the nucleotide sequences of the recA gene and a more variable hemolysin/cytotoxin gene (tly) demonstrated that types I and II represent distinct lineages. Their investigations also revealed that strains within the type I lineage could be further split into two clades, known as types IA and IB (8, 9). An additional phylogenetic group of P. acnes, known as type III, was described later (10). Recent studies based on multilocus sequence typing (MLST) further sub-divided P. acnes into closely related clusters, some of which were associated with various diseases including acne (11-13).

The first complete genome sequence of P. acnes, KPA171202, a type IB strain, provided insights into the pathogenic potential of this Gram-positive bacterium (14). The genome is 2.56 Mbp with a GC content of 60%. It encodes 2,333 open reading frames (ORFs) including multiple gene products involved in degrading host molecules, such as sialidases, neuraminidases, endoglycoceramidases, lipases, and pore-forming factors. However, the sequence of a single genome does not reflect the genetic landscape of the organism and how genetic variations among strains determine their various phenotypes and pathogenic properties.

To better understand human microbiome variations at the strain level, as part of the Human Microbiome Project (HMP) (15, 16), the reference genome sequences of 66 P. acnes strains were previously generated. These strains were selected from a collection of over 1,000 strains isolated from a cohort of healthy subjects and acne patients (4). These 66 strains represent the major lineages of P. acnes found on the human skin, including types IA, IB, and II. To cover all the main P. acnes lineages in the analysis, three additional P. acnes strains were sequenced, including the first available type III P. acnes genome. Thirteen P. acnes genomes sequenced by other research groups (14, 17-22) were also available at the time of analysis. With a total of 82 genomes, a comparative genome analysis was performed to characterize the pan-genome of P. acnes, the phylogenetic relationships among different lineages, the microevolution of the strains in the same individual microbiome, and the genetic elements specific to each lineage and their associations with health and disease.

Results

P. acnes Strains and General Genome Features

To understand the genomic diversity of this important skin commensal at the strain level, the genomes of 69 sequenced P. acnes strains were analyzed. Among them, 67 P. acnes strains were isolated from the skin of healthy individuals and acne patients (3, 4), and two P. acnes strains, HL201PA1 and HL202PA1, were isolated from refractory endodontic lesions (23) (Table 2-1).

TABLE 2-1 General features of the 82 P. acnes genomes

Genome | Strain name | Origin | Ribotype | recA type | Sequencing method | Genome size (Mb) | GC% | Number of ORFs
1 | HL001PA1 | Skin | 2 | II | Illumina | 2.49 | 60.0 | 2,661
2 | HL002PA1 | Skin | 3 | IB | Illumina | 2.48 | 60.1 | 2,549
3 | HL002PA2 | Skin | 1 | IA | Illumina | 2.48 | 60.1 | 2,594
4 | HL002PA3 | Skin | 1 | IA | Illumina | 2.48 | 60.1 | 2,565
5 | HL005PA1 | Skin | 4 | IA | Illumina | 2.53 | 60.2 | 2,724
6 | HL005PA2 | Skin | 1 | IA | Illumina | 2.48 | 60.0 | 2,645
7 | HL005PA3 | Skin | 1 | IA | Illumina | 2.48 | 60.1 | 2,579
8 | HL005PA4 | Skin | 3 | IB | Illumina | 2.47 | 60.0 | 2,607
9 | HL007PA1 | Skin | 4 | IA | Illumina | 2.53 | 60.2 | 2,691
10 | HL013PA1 | Skin | 3 | IB | Illumina | 2.48 | 60.0 | 2,618
11 | HL013PA2 | Skin | 1 | IA | Illumina | 2.48 | 60.1 | 2,588
12 | HL020PA1 | Skin | 1 | IA | Illumina | 2.48 | 60.1 | 2,554
13 | HL025PA1 | Skin | 1 | IB | Illumina | 2.54 | 60.1 | 2,581
14 | HL025PA2 | Skin | 3 | IB | Illumina | 2.48 | 60.0 | 2,616
15 | HL027PA1 | Skin | 3 | IB | Illumina | 2.53 | 60.1 | 2,711
16 | HL027PA2 | Skin | 1 | IA | Illumina | 2.48 | 60.1 | 2,629
17 | HL030PA1 | Skin | 1 | IB | Illumina | 2.54 | 60.0 | 2,662
18 | HL030PA2 | Skin | 3 | IB | Illumina | 2.51 | 60.1 | 2,647
19 | HL036PA1 | Skin | 532 | IA | Illumina | 2.48 | 60.1 | 2,575
20 | HL036PA2 | Skin | 532 | IA | Illumina | 2.48 | 60.1 | 2,565
21 | HL036PA3 | Skin | 1 | IA | Illumina | 2.48 | 60.1 | 2,601
22 | HL037PA1 | Skin | 3 | IB | Illumina | 2.48 | 60.1 | 2,617
23 | HL038PA1 | Skin | 4 | IA | Illumina | 2.54 | 60.2 | 2,663
24 | HL042PA3 | Skin | 6 | II | Roche/454 | 2.53 | 60.1 | 2,610
25 | HL043PA1 | Skin | 5 | IA | Illumina | 2.53 | 60.2 | 2,698
26 | HL043PA2 | Skin | 5 | IA | Illumina | 2.53 | 60.2 | 2,688
27 | HL045PA1 | Skin | 4 | IA | Illumina | 2.53 | 60.2 | 2,692
28 | HL046PA1 | Skin | 3 | IB | Illumina | 2.48 | 60.0 | 2,599
29 | HL046PA2 | Skin | 1 | IA | Illumina | 2.53 | 60.1 | 2,692
30 | HL050PA1 | Skin | 3 | IB | Illumina | 2.48 | 60.0 | 2,652
31 | HL050PA2 | Skin | 1 | II | Illumina | 2.46 | 60.1 | 2,581
32 | HL050PA3 | Skin | 3 | IB | Illumina | 2.48 | 60.0 | 2,558
33 | HL053PA1 | Skin | 4 | IA | Illumina | 2.53 | 60.2 | 2,632
34 | HL053PA2 | Skin | 8 | IB | Illumina | 2.51 | 60.1 | 2,664
35 | HL056PA1 | Skin | 4 | IA | Illumina | 2.48 | 60.1 | 2,581
36 | HL059PA1 | Skin | 16 | IB | Illumina | 2.48 | 60.1 | 2,570
37 | HL059PA2 | Skin | 16 | IB | Illumina | 2.48 | 60.0 | 2,535
38 | HL060PA1 | Skin | 2 | II | Illumina | 2.48 | 60.1 | 2,601
39 | HL063PA1 | Skin | 1 | IA | Illumina | 2.48 | 60.1 | 2,520
40 | HL063PA2 | Skin | 3 | IB | Illumina | 2.53 | 60.0 | 2,669
41 | HL067PA1 | Skin | 3 | IB | Illumina | 2.53 | 60.1 | 2,633
42 | HL072PA1 | Skin | 5 | IA | Illumina | 2.53 | 60.1 | 2,594
43 | HL072PA2 | Skin | 5 | IA | Illumina | 2.53 | 60.1 | 2,672
44 | HL074PA1 | Skin | 4 | IA | Illumina | 2.53 | 60.2 | 2,723
45 | HL078PA1 | Skin | 1 | IA | Illumina | 2.58 | 60.1 | 2,785
46 | HL082PA1 | Skin | 8 | IB | Illumina | 2.50 | 60.1 | 2,648
47 | HL082PA2 | Skin | 2 | II | Illumina | 2.51 | 60.0 | 2,644
48 | HL083PA1 | Skin | 1 | IA | Illumina | 2.48 | 60.1 | 2,575
49 | HL083PA2 | Skin | 3 | IB | Illumina | 2.48 | 60.0 | 2,633
50 | HL086PA1 | Skin | 8 | IB | Illumina | 2.53 | 60.1 | 2,610
51 | HL087PA1 | Skin | 3 | IB | Illumina | 2.48 | 60.1 | 2,584
52 | HL087PA2 | Skin | 1 | IA | Illumina | 2.48 | 60.1 | 2,572
53 | HL087PA3 | Skin | 3 | IB | Illumina | 2.52 | 60.1 | 2,619
54 | HL092PA1 | Skin | 8 | IB | Illumina | 2.50 | 60.1 | 2,590
55 | HL096PA1 | Skin | 5 | IA | Roche/454 | 2.49 | 60.0 | 2,393
56 | HL096PA2 | Skin | 5 | IA | Illumina | 2.56 | 60.1 | 2,638
57 | HL096PA3 | Skin | 1 | IA | Illumina | 2.56 | 60.0 | 2,651
58 | HL097PA1 | Skin | 5 | IC | Illumina | 2.52 | 60.2 | 2,617
59 | HL099PA1 | Skin | 4 | IA | Illumina | 2.58 | 60.1 | 2,735
60 | HL100PA1 | Skin | 1 | IA | Illumina | 2.48 | 60.1 | 2,562
61 | HL103PA1 | Skin | 2 | II | Illumina | 2.48 | 60.1 | 2,546
62 | HL106PA1 | Skin | 2 | II | Illumina | 2.48 | 60.1 | 2,533
63 | HL106PA2 | Skin | 1 | IA | Illumina | 2.48 | 60.1 | 2,567
64 | HL110PA1 | Skin | 8 | IB | Illumina | 2.51 | 60.1 | 2,667
65 | HL110PA2 | Skin | 8 | IB | Illumina | 2.50 | 60.1 | 2,614
66 | HL110PA3 | Skin | 6 | II | Illumina | 2.54 | 60.1 | 2,806
67 | HL110PA4 | Skin | 6 | II | Illumina | 2.54 | 60.1 | 2,724
68 | HL201PA1 | Refractory endodontic lesion | 6 | III | Illumina | 2.48 | 60.1 | 2,629
69 | HL202PA1 | Refractory endodontic lesion | Not assigned | II | Illumina | 2.56 | 60.0 | 2,821
70 | KPA171202 | Plate | 1 | IB | Sanger | 2.56 | 60.0 | 2,297
71 | J139 | Skin | 2 | II | Roche/454 | 2.48 | 60.0 | 2,364
72 | J165 | Skin | 1 | IA | Roche/454 | 2.50 | 60.0 | 2,403
73 | SK137 | Skin | 1 | IA | Roche/454 | 2.50 | 60.0 | 2,352
74 | SK187 | Skin | 3 | IB | Roche/454 | 2.51 | 59.0 | 2,381
75 | SK182 | Skin | 1 | IA | Roche/454 | 2.48 | 60.0 | 2,338
76 | 266 | Pleuropulmonary infection | 1 | IA | Roche/454 | 2.50 | 60.0 | 2,412
77 | 6609 | Skin | 1 | IB | Solid | 2.56 | 60.0 | 2,358
78 | ATCC11828 | Abscess | 2 | II | Solid | 2.49 | 60.0 | 2,260
79 | P.acn17 | Corneal scrape | 3 | IB | Solid | 2.52 | 60.0 | 2,266
80 | P.acn31 | Aqueous humour | 3 | IB | Solid | 2.50 | 60.0 | 2,247
81 | P.acn33 | Aqueous humour | 3 | IB | Solid | 2.49 | 60.0 | 2,236
82 | PRP-38 | Skin | 5 | IC | Solid | 2.51 | 60.0 | 2,233

These 69 strains cover all the known P. acnes lineages isolated to date. The strains were classified based on their 16S ribosomal RNA (rRNA) sequences. Each unique 16S rRNA sequence was defined as a ribotype (RT). All the sequenced P. acnes genomes had three identical copies of 16S rRNA. Based on the metagenomic study of the skin microbiome associated with acne (4), among the top ten major ribotypes, RT1, RT2, and RT3 were the most abundant and found in both healthy individuals and acne patients with no significant differences. RT4, RT5, and RT8, however, were enriched in acne patients, while RT6 was mostly found in healthy individuals. The 69 strains included 19 RT1 strains, five RT2 strains, 15 RT3 strains, eight RT4 strains, seven RT5 strains, four RT6 strains, six RT8 strains, four strains of minor ribotypes, and one type III strain. The average genome size was 2.50 Mb (ranging from 2.46 to 2.58 Mb) and the GC content was 60%. On average, each genome encoded 2,626 ORFs (ranging from 2,393 to 2,806) (Table 2-1).

The analysis included 13 additional P. acnes genomes that were publicly available (14, 17-22) (Table 2-1). The average genome size of these 13 P. acnes strains was 2.51 Mb (ranging from 2.48 to 2.56 Mb) and the GC content was 60%, encoding 2,319 ORFs on average (ranging from 2,233 to 2,412). These 13 genomes include six RT1 strains, two RT2 strains, four RT3 strains, and one RT5 strain; however, no genomes of RT4, RT6, RT8 and type III strains were available. The sequencing effort significantly increased the number of genomes for each P. acnes lineage as well as the number of lineages covered.

P. acnes Pan-Genome

To determine the genetic landscape of P. acnes, the pan-genome based on the 82 P. acnes genomes was estimated. First, the number of new genes that would be discovered by sequencing additional P. acnes genomes was estimated using a power law regression analysis, n = κN^(−α) (24) (FIG. 13). The analysis identified that α was 0.788. The average number of new genes added by a novel genome was three when the 82nd genome was added. The number of P. acnes pan-genes that would be accumulated by sequencing additional P. acnes genomes was then estimated using a power law regression analysis, n = κN^γ (FIG. 14). The exponent γ was 0.067, and P. acnes had 3,136 pan-genes (N=82). Based on these results, the pan-genome of P. acnes is defined as open, as the exponent α was less than one and γ was greater than zero (24). However, since α was close to one and γ was close to zero, it is believed that this organism evolved tightly without large expansions.
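A power law of the form n = κN^exponent can be fitted by ordinary least squares on log-transformed values, as in the minimal Python sketch below. Whether the original analysis used this log-log approach or a nonlinear fit is not stated, so the method here is an assumption; the function name `fit_power_law` is hypothetical and the data in the usage example are synthetic.

```python
import math

def fit_power_law(genome_counts, gene_counts):
    """Fit gene_counts = kappa * genome_counts ** exponent by least squares
    on log-log values (an assumed fitting method). Returns (kappa, exponent)."""
    log_x = [math.log(x) for x in genome_counts]
    log_y = [math.log(y) for y in gene_counts]
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    exponent = sxy / sxx
    kappa = math.exp(mean_y - exponent * mean_x)
    return kappa, exponent

# Synthetic example: values generated from a known power law.
genomes = list(range(1, 21))
pan_genes = [2500 * n ** 0.05 for n in genomes]
kappa, gamma = fit_power_law(genomes, pan_genes)
```

For the new-gene curve the fitted exponent is negative, and α is its absolute value; for the pan-gene curve the fitted exponent is γ directly.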

Phylogenetic Relationships Among the P. acnes Genomes

A genome comparison of the 82 P. acnes strains revealed that 2.20 Mb (88% of the average genome) was shared by all the P. acnes genomes; these shared sequences are referred to herein as the “core regions.” Within the core regions, 123,223 unique single nucleotide polymorphisms (SNPs) were detected among the strains. Twenty-seven percent of the SNPs were unique to type I, 22% were unique to type II, and 22% were unique to type III (FIG. 15). A phylogenetic tree based on the 123,223 SNPs in the core regions was constructed (FIG. 16). The tree showed that the recA type classification of the strains was consistent with the major clades based on the genomes. The recA type IA, IB, and II strains were all clustered together within each type, respectively, except HL097PA1 and PRP-38. The only recA type III strain, HL201PA1, formed a separate branch from type I and type II strains. The tree also showed that the 16S rRNA ribotypes of the strains were consistent with the phylogenetic relationships inferred from the genome sequences. Most of the RT1 strains were clustered in clade IA-1, while all the RT4 and most of the RT5 strains were clustered in clade IA-2. All six RT8 strains were clustered together in clade IB-1. All the RT3 and RT16 strains were clustered together in clade IB-2 except SK187. HL030PA1 and KPA171202 were clustered together with 6609, as a distinct IB-3 clade. HL097PA1 and PRP-38 were clustered together and were classified as a novel type IC recently named by McDowell et al. (22). All the RT2 strains were clustered in clade II, distant from clade I, together with RT6 strains. HL202PA1, which is an RT6 strain and was isolated from an oral site, was not much different from the skin RT6 isolates and was clustered together with them. The sequence types of all the strains were assigned based on two published MLST schemes (11, 13) and are shown in Table 51. The phylogenetic tree based on the core genome regions demonstrated that 16S ribotyping can be used for P. acnes strain identification and classification. It provides a much higher resolution than recA typing and, at the same time, is much simpler and faster than MLST, requiring only one gene, whereas MLST is a laborious process generally requiring 7-9 genes.

The large number of genome sequences that were generated permitted analysis of the P. acnes pan-genome at the clade level. Clades IA, IB and II had 36, 33 and 12 genomes, respectively. Based on the power law regression analyses described above, it was determined that at the clade level P. acnes also has an open pan-genome, with limited expansion, for the recA type IA, type IB and type II clades (FIG. 17). The expansion rates were not significantly different among the clades and were similar to the rate at the species level. This suggests that all the major lineages of P. acnes evolved at a similar rate.

SNP Distribution in the Core Genome Regions

To understand whether there are “hot spots” for mutation and/or recombination in the P. acnes genomes, it was determined whether the SNPs were randomly distributed throughout the genomes or were enriched in particular regions. The frequency of SNPs in each protein coding gene in the core regions was calculated. The average rate of polymorphic sites in the core regions was 5.3%, i.e., 5.3 unique SNPs in every 100 bp. This rate is comparable to the ones found in multiple gut bacterial genomes (25). Among the 1,888 genes encoded in the core regions, 55 genes had SNP frequencies more than two standard deviations (SD) above the average, and 47 genes had SNP frequencies more than three SD above the average (FIG. 18(A)). Using the Kolmogorov-Smirnov (K-S) test, it was demonstrated that these 102 highly mutated genes were not randomly distributed throughout the genome (P<0.01) (FIG. 18(B)). This suggests that P. acnes has an evolutionary risk management strategy. Based on the Clusters of Orthologous Groups (COG) categories, the functions of these 102 genes showed a distribution similar to that of all 1,888 genes in the core regions. There was no enrichment of a particular functional category in these frequently mutated genes.
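The per-gene screening described above can be illustrated with a minimal sketch. It assumes that the SNP frequency of a gene is its number of polymorphic sites divided by its length, that the population standard deviation is used, and that the two groups are disjoint (two to three SD, and more than three SD), which is how the 55 and 47 genes add up to 102. The function names and the toy counts are hypothetical and are not data from this study.

```python
from statistics import mean, pstdev

def snp_frequency(snp_count, gene_length_bp):
    """Fraction of polymorphic sites in one gene (assumed definition)."""
    return snp_count / gene_length_bp

def flag_highly_mutated(freq_by_gene):
    """Split genes into those 2-3 SD and those >3 SD above the mean frequency."""
    values = list(freq_by_gene.values())
    mu = mean(values)
    sd = pstdev(values)  # population SD; the source does not say which SD was used
    two_sd, three_sd = [], []
    for gene, freq in freq_by_gene.items():
        if freq > mu + 3 * sd:
            three_sd.append(gene)
        elif freq > mu + 2 * sd:
            two_sd.append(gene)
    return two_sd, three_sd

# Toy input: 27 ordinary genes plus two outliers, all 1,000 bp long (illustrative only).
counts = {f"gene_{i:02d}": 50 for i in range(27)}
counts["gene_mid"] = 250
counts["gene_high"] = 400
freqs = {gene: snp_frequency(n, 1000) for gene, n in counts.items()}
two_sd, three_sd = flag_highly_mutated(freqs)
print(two_sd, three_sd)
```

The test of whether the flagged genes are randomly positioned along the genome (the K-S test in the text) is not reproduced in this sketch.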

It was further determined whether the mutations in the core regions were under selection by calculating the ratio of non-synonymous (NS) vs. synonymous (S) SNPs for the 1,888 genes. The average rate of NS mutations was 38%. Among the 1,888 genes, 54 genes had NS mutation rates more than two SD above the average and 13 genes had NS mutation rates more than three SD above the average (FIG. 18(C)). These 67 genes were randomly distributed in the genome and not particularly enriched in certain regions (P>0.05 with the K-S test) (FIG. 18(D)). Most of the 102 genes with higher SNP frequencies did not overlap with these 67 genes, suggesting that independent evolutionary events might lead to these gene alterations. Only ten genes had both high SNP frequencies and high NS mutation rates, all annotated as hypothetical proteins.

Evolutionary Relationships of the Strains Isolated from the Same Individuals

The large number of P. acnes strains isolated from the cohort of acne patients and healthy individuals allowed the investigation of whether the P. acnes strains in hair follicles from the same individual were clonal. Based on previous metagenomic analysis, it was demonstrated that most individuals harbored multiple P. acnes strains from different lineages (4). However, it was not known whether the strains of the same lineage in the same individual were derived from the same ancestor. Genome sequences of the strains isolated from the same samples make it possible to examine whether the Strains from the Same Individuals (SSIs) evolved from the same origin via clonal expansion. The 69 sequenced P. acnes strains included 49 SSIs: 13 duets (i.e., 13 pairs of strains isolated from 13 individuals), five trios, and two quartets. Twenty-three SSIs were clustered in the same clades: nine in clade IA-1, four in clade IA-2, two in clade IB-1, six in clade IB-2 and two in clade II. The distance (substitution rate at the 123,223 SNP sites in the core regions) between each pair of SSIs was calculated (FIG. 16). The average distance of the SSIs in clade IA-1 was 0.0014, while that of strains from different individuals in clade IA-1 was 0.0064 (P<0.001). Consistent results were observed in other clades including IA-2, IB-1, IB-2, and II (FIG. 19(A)). This demonstrated that the SSIs in the same lineage were significantly more similar to each other than the strains isolated from different individuals, suggesting that they were clonal in each individual. Among the RT4 and RT5 strains within clade IA-2, however, the average distance between SSIs (0.0004) was not significantly different from the average distance between strains from different individuals (0.0017) (P=0.072). Moreover, the average distance between RT4/RT5 strains from different individuals (0.0017) was similar to the average distance between the SSIs in clade IA-1 (0.0014), and even shorter than the average distances between the SSIs in clades IB-1 (0.0059), IB-2 (0.0019) and II (0.0022) (Fig. S3A). This suggests that although isolated from different individuals, these RT4 and RT5 strains seemed to be clonal and had evolved from the same recent ancestor. A similar relationship was observed between the two RT5 strains in clade IC, where HL097PA1 and PRP-38, isolated from different individuals, were closely related to each other with a distance of 0.0012. The metagenomic study has demonstrated a strong association of strains of RT4 and RT5 with acne (4). The clonality of these strains isolated from different individuals suggests that RT4 and RT5 strains may be transmitted among individuals. This finding is consistent with the previous clinical report that antibiotic-resistant Propionibacteria were transmissible between acne-prone individuals including dermatologists (26), as most of the antibiotic-resistant P. acnes strains belong to RT4 and RT5 (4,13). The analysis of SSIs further supports the theory that RT4 and RT5 strains may be a pathogenic factor in acne.
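The comparison of within-individual and between-individual distances can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in the study: it assumes that the individual can be read from the strain name as the prefix before "PA" (as in HL005PA1 and HL005PA3), and the distances in the toy input are invented for demonstration. The significance test is omitted because the text does not name the test that produced the reported P values.

```python
from statistics import mean

def individual_of(strain):
    """Assumed naming convention: 'HL005PA3' -> individual 'HL005'."""
    return strain.partition("PA")[0]

def split_pair_distances(distances):
    """Separate pairwise distances into same-individual (SSI) and different-individual pairs.

    distances maps a (strain_a, strain_b) tuple to a distance within one clade.
    """
    same, different = [], []
    for (a, b), d in distances.items():
        if individual_of(a) == individual_of(b):
            same.append(d)
        else:
            different.append(d)
    return same, different

# Hypothetical distances within a single clade (illustrative values only).
toy = {
    ("HL005PA2", "HL005PA3"): 0.0010,
    ("HL096PA1", "HL096PA2"): 0.0018,
    ("HL005PA2", "HL096PA1"): 0.0060,
    ("HL005PA3", "HL096PA2"): 0.0068,
}
same, different = split_pair_distances(toy)
print(mean(same), mean(different))
```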

To determine whether the distances between strain pairs from the same individuals but belonging to different lineages were different from random strain pairs, the distances of any pair of SSIs from different clades were calculated. The average distance of the SSIs between clades IA-1 and IA-2 (i.e., HL005PA3 vs. HL005PA1, HL005PA2 vs. HL005PA1, HL096PA3 vs. HL096PA1, and HL096PA3 vs. HL096PA2) was 0.039, similar to that of the isolates from different individuals (0.040). Similar results were obtained for all other clade pair comparisons (FIG. 19(B)). These results demonstrated that SSIs from different clades differed from each other to the same extent as strains from different individuals. This analysis suggests that in each individual microbiome P. acnes strains undergo clonal expansion in the same population, while multiple strain populations can often co-exist in the same community with little recombination.

Non-Core Genome Regions in Type I Strains

By comparing the genome sequences of the 82 P. acnes strains, non-core genome regions were identified that were not shared by all 82 strains. The total length of the non-core regions was approximately 0.90 Mb. The average GC content of the non-core regions, 58%±6.9%, was slightly lower than that of the core regions, suggesting that part of the non-core regions might have originated from other species via horizontal gene transfer.
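For readers unfamiliar with the statistic, GC content is the fraction of G and C bases among the unambiguous bases of a sequence. The sketch below shows that calculation on short made-up sequences; how ambiguous bases were handled in the study is not stated, so ignoring them here is an assumption.

```python
def gc_content(sequence):
    """Fraction of G/C among A, C, G and T; ambiguous bases such as N are ignored."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return gc / acgt if acgt else 0.0

# Made-up fragments, not sequences from the study.
print(gc_content("GGCCAT"))
print(gc_content("ggccatNN"))
```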

Different lineages of P. acnes strains have distinct non-core regions. Using hierarchical clustering of the non-core regions, it was shown that the strains of the same ribotypes were clustered together with distinct separations among the clades (FIG. 20). Among the non-core regions, the genetic elements specific to each lineage were identified, which may explain the phenotypic and functional differences of the strains in health and disease. In clade IA-2, genomic loci 1, 2 and 3 were identified, which were unique to mainly the RT4 and RT5 strains (4). These loci appear to have originated from mobile elements, encode several virulence genes, and may contribute to the virulence of these strains. In addition, the genomic island-like cluster 2 (GI2) (18) was uniquely absent in most of the strains in this clade. Clade IB-1 consisted of all RT8 strains, which were also highly associated with acne based on our metagenomic study (4). They all have a unique genomic island (locus 4), which is 20 Kb long and encodes a series of nonribosomal peptide synthetases (NRPS), which may contribute to increased virulence of these strains. Most RT3 and RT16 strains belong to clade IB-2 and have fewer non-core regions than the strains in other clades. This may be explained by the absence of the entire rearrangement hot spot (RHS) protein family, which functions in genomic rearrangements as previously implicated in Escherichia coli (27). Clade IB-3 consisted of three strains, including KPA171202. Three of the four genomic islands described previously (18), GI1, GI3 and GI4, were unique to this clade and were absent in all other strains. This analysis suggests that KPA171202, although it was the first P. acnes strain with a completely sequenced genome, does not seem to be a common skin P. acnes strain representing one of the major lineages. This result is consistent with previous studies using MLST (11-13). Strains of clade IC belong to RT5. They also contain locus 3, a linear plasmid, which is highly homologous to the locus 3 in the RT4 and RT5 strains of clade IA-2 and encodes a tight adhesion locus that originated from Clostridium leptum (4). In general, although strains in different lineages had a similar genome size with similar gain and loss of genetic material, they harbor distinct genetic elements which may give rise to their different virulence properties.

Non-Core Genome Regions in Type II Strains

Strains in clade II, mainly RT2 and RT6, were more distantly related to the strains in clade I. Based on the metagenomic study, strains in clade II were not associated with acne, as RT2 was evenly distributed between acne patients and healthy individuals, while RT6 was significantly enriched in healthy individuals (4). Compared to type I strains, the genomes of RT2 and RT6 strains lack several regions, which are approximately 92 Kb long in total and encode 107 ORFs. RT2 and RT6 genomes have additional genomic regions of a similar size encoding 93 ORFs (FIG. 20). Based on the COG classification, there were no significant differences in the distribution of the functional categories between the 107 type I specific ORFs and the 93 type II specific ORFs.

The most distinctive genomic feature of RT2 and RT6 strains is the clustered regularly interspaced short palindromic repeats (CRISPR)/Cas locus (4). The CRISPR/Cas system provides acquired bacterial immunity against viruses and plasmids by targeting nucleic acids in a sequence-specific manner (28). All the sequenced strains of RT2 and RT6 encoded a complete set of CRISPR/Cas genes and at least one repeat and spacer sequence, while none of the strains of other ribotypes did. Based on its complete genome sequence (20), strain ATCC11828 appeared to be an exception, having only a terminal sequence but no spacer sequence. However, using PCR and sequencing it was determined that ATCC11828 has one repeat-spacer sequence (Table S2).

TABLE S2
CRISPR spacer sequences found in the genomes of ATCC 11828, HL042PA3 and HL202PA1

Strain | Ribotype | Spacer | Protospacer | BLAST result | Match found in
ATCC11828 | RT2 | 1 | CATCTGCCAACGAGCGAGAGTGGCGCGGTGTTCC | No hits |
 | | 2 | CGAGGGCTACCACGTGGTCGATTTGGACTGTCG | Clostridium leptum DSM 753, CLOLEP_00167; Propionibacterium acnes SK137, HMPREF0675_[illegible when filed] (Domain of unknown function) | Locus 2
 | | 3 | CAGGCGCTCCACTCCCTCGCCCTGGCCACCAAC | No hits |
[illegible when filed] | [illegible when filed] | 1 | CTGACTGGTTTGGGTCATACGTCTTCTGACACG | Propionibacterium acnes phage PA6 gp14 (Tape measure protein); Propionibacterium acnes phage PAD[illegible when filed] gp14 (Tape measure protein); Propionibacterium acnes phage PAS50 gp14 (Tape measure protein) |
 | | 2 | TCACAGGCCACGCAGGCACATCACCCTTATTAG | Propionibacterium acnes phage PA6 gp14 (Minor tail protein); Propionibacterium acnes phage PAD[illegible when filed] gp14 (Minor tail protein); Propionibacterium acnes phage PAS50 gp14 (Minor tail protein) |
 | | 3 | CTCCCCCTCCTCCCCGGGAGGAAAAGCAGACCA | Propionibacterium acnes phage PAS50 gp14 (Minor tail protein) |
 | | 4 | CGAGGGCTACCACGTGGTCGATTTGGACTGTCG | Clostridium leptum DSM 753, CLOLEP_00167; Propionibacterium acnes SK137, HMPREF0675_3193 (Domain of unknown function) | Locus 2
HL202PA1 | [illegible when filed] | 1 | CGAGGGCTACCACGTGGTCGATTTGGACTGTCG | Clostridium leptum DSM 753, CLOLEP_00167; Propionibacterium acnes SK137, HMPREF0675_3193 (Domain of unknown function) | Locus 2
 | | 2 | CAGGCGCTCCACTCCCTCGCCCTGGCCACCAAC | No hits |

[illegible when filed] indicates data missing or illegible when filed

A total of 48 spacer sequences were found in the 11 RT2 and RT6 strains, 29 of which were unique. In other bacterial species, it has been established experimentally and computationally that the spacers at the leader-proximal end are more diversified, while the spacers at the leader-distal end are more conserved among strains. The evolutionary relationships among the RT2 and RT6 strains based on their shared spacer sequences were analyzed. HL060PA1 and HL082PA2, which were clustered tightly in clade II, shared the same spacer S2 (FIG. 21). J139, ATCC11828, HL110PA4, HL110PA3, and HL202PA1 shared the same spacers S17 and S18 (FIG. 21). These results suggest that the strains in each of these groups probably evolved from the same ancestor before acquiring additional spacers. The relationships among the strains based on shared CRISPR spacers are consistent with the phylogenetic relationships calculated based on the SNPs in the core regions. In addition, multiple type II strains harbored spacer sequences that match sequences in loci 2 and 3, which were unique to mainly acne-associated RT4 and RT5 strains (4). The sequences in loci 2 and 3 appeared to have originated from C. leptum and encode potential virulence factors (Fig. S4). These loci may have been acquired by RT4 and RT5 strains, while the genomes of RT2 and RT6 that encode these spacers may be capable of eliminating invading foreign DNA through the CRISPR mechanism (4).
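Grouping strains by shared spacers, as described above, amounts to inverting a strain-to-spacer mapping. The following is a minimal sketch of that idea. The input lists only the shared spacers named in the text (S2, S17 and S18); the other spacers of each strain are omitted, and the function name is hypothetical.

```python
from collections import defaultdict

def strains_sharing_spacers(spacers_by_strain):
    """Return {spacer: set of strains} for every spacer found in more than one strain."""
    strains_by_spacer = defaultdict(set)
    for strain, spacers in spacers_by_strain.items():
        for spacer in spacers:
            strains_by_spacer[spacer].add(strain)
    return {s: members for s, members in strains_by_spacer.items() if len(members) > 1}

# Only the shared spacers reported in the text are listed for each strain.
spacers_by_strain = {
    "HL060PA1": {"S2"},
    "HL082PA2": {"S2"},
    "J139": {"S17", "S18"},
    "ATCC11828": {"S17", "S18"},
    "HL110PA4": {"S17", "S18"},
    "HL110PA3": {"S17", "S18"},
    "HL202PA1": {"S17", "S18"},
}
shared = strains_sharing_spacers(spacers_by_strain)
for spacer in sorted(shared):
    print(spacer, sorted(shared[spacer]))
```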

The large number of high quality draft genome sequences enabled the detection not only of large genomic variations, but also of small but essential genomic alterations. It was previously reported that type II strains showed decreased lipase activity (10). Lipase hydrolyzes triglycerides and releases free fatty acids, which is thought to be essential in P. acnes virulence. Based on the genome annotation, 13 genes were identified with a potential lipase function (FIG. 22(A)). Among them, insertions/deletions ranging from one nucleotide to 13 nucleotides were detected, which may explain the decreased lipase activity in type II strains. Two triacylglycerol lipases were encoded in tandem in P. acnes genomes, HMPREF0675-4855 and HMPREF0675-4856 (according to the annotation of SK137). All the type II strains and IB-3 strains had a deletion of the “TATA-box” 20 bp upstream of the start codon of the second lipase gene, HMPREF0675-4856 (FIG. 22(B)). In addition, there was a one-nucleotide deletion at position 124G of the second lipase gene, leading to a frameshift and the introduction of a premature stop codon. These two deletions may explain the decreased lipase activity and hence decreased virulence in acne observed in type II strains in previous studies (4, 10).

Non-Core Genome Regions in the Type III Strain

Type III strains are rarely found on the skin surface. A type III P. acnes strain isolated from a refractory endodontic lesion, HL201PA1, was sequenced. This first available type III genome permitted the identification of the genetic elements specific to this lineage. Compared to type I and type II strains, the genome of HL201PA1 lacks a few regions with a total length of 43 Kb (FIG. 20). There were 42 ORFs encoded in these regions, including anaerobic dimethyl sulfoxide reductase (PPA0516-PPA0517), iron(III) dicitrate transport system permease (PPA0792-PPA0793), 3-isopropylmalate dehydratase (PPA1361-PPA1363), and maltose transport system permease (PPA1553-PPA1554).

Discussion

High-throughput genome sequencing and comparative analysis of a large number of related strains have been used to study the spread and microevolution of several pathogens at the strain level, including methicillin-resistant Staphylococcus aureus (29), Streptococcus pneumoniae (30), and Vibrio cholerae in the Haiti outbreak (31), demonstrating the power of comparative genome analysis of multiple strains in improving our understanding of bacterial pathogens. However, this approach has rarely been applied to commensal species to understand the varied virulence potential of different strains and their roles in both health and disease.

This study presents a comparative genome analysis of a major skin commensal, P. acnes, based on a large number of sequenced strains. This collection of strains not only includes strains associated with either healthy skin or acne, but also a large number of strain pairs that were isolated from the same individuals. This allowed the comparison of phylogenetic relationships and microevolution of the P. acnes strains associated with health vs. disease as well as of the strains in the same individual microbiome.

By comparing 82 P. acnes genomes, it was shown that all P. acnes strains had a similar genome size with a similar GC content, encoding 2,577 ORFs on average (Table 1). Although P. acnes has an open pan-genome, unlike many other open-genome species (24), it has limited genome expansion with only a few new genes added per genome (FIGS. 13 and 14). The rate of genome expansion is similar within the major lineages (FIG. 17). There was limited recombination among different P. acnes strains, and thus 16S rRNA ribotypes can be used as a proxy for P. acnes strain identification and classification (FIG. 16). Compared to other typing methods, 16S ribotyping has a much higher resolution than recA typing and is much easier and faster than the traditional MLST method. This method can be applied in a high-throughput manner by combining it with next-generation sequencing, and thus allows one to detect microbiome variations at the strain level (4). This is advantageous because identifying and understanding the strain level variations of the human microbiome is medically important.

The genomes of the sets of strains isolated from the same individual samples were compared (FIGS. 16 and 19). This collection of genome data is unique, and no study of this kind has been performed to investigate the microevolution of a human commensal within an individual microbiome. It was found that while multiple P. acnes strain populations co-existed in the same individual microbiome, clonal expansions occurred in each population with little recombination among different populations. Within each lineage, the strains isolated from the same individuals were significantly more similar to each other than strains from different individuals, except for the disease-associated RT4 and RT5 strains (FIG. 19). Although isolated from different individuals, the RT4 and RT5 strains appeared to be clonal and to have evolved from the same virulent ancestor strain (FIG. 16). This supports the observation that these strains were transmissible (26) and that they may play a role in acne pathogenesis (4). This finding is important and can help control the spread of antibiotic resistant strains and develop new targeted therapies for acne.

By analyzing the non-core regions, the genomic elements and alterations specific to each lineage were identified (FIG. 20). These lineage-specific elements may confer on the strains different physiological and functional properties and thus lead to their different roles as a commensal in health or as a pathogen in disease. Among the acne-associated strains, RT4 and RT5 strains encode three distinct loci that originated from mobile elements, and RT8 strains encode a distinct region containing a set of NRPS. The virulence genes encoded in these strain-specific regions may explain the associations of these strains with acne and help the development of new drugs targeting these strains. RT2 and RT6 strains, which were not associated with acne and were enriched in healthy skin, respectively, all encode CRISPR/Cas elements. The CRISPR mechanism may prevent these strains from acquiring virulence genes from invading foreign mobile elements. In addition, these strains contain genomic variations in lipases that may alter lipid metabolism and reduce their virulence (FIG. 22).

In conclusion, by characterizing the genetic landscape and diversity of P. acnes with a large number of genomes, this study provides genomic evidence that may explain the diverse phenotypes of P. acnes strains, as well as new insight into the dual role of this commensal in human skin health and disease. The findings from this comparative genome analysis provide new perspectives on the strain diversity and evolution of commensals in the human microbiome. As many current microbiome studies focus on the associations of microbial communities with health and disease, this study underscores the importance of understanding the commensal microbiome at the strain level (25). The findings from this study also shed light on new strain-specific therapeutics for acne and other P. acnes associated diseases.

Materials and Methods

P. acnes Strains

Among the 69 P. acnes strains that were sequenced, 67 were isolated from the skin microcomedone samples from acne patients and healthy individuals (4). The other two strains, HL201PA1 and HL202PA1, were isolated from refractory endodontic lesions (23) and were provided by Dr. David Beighton at King's College London.

Whole Genome Shotgun Sequencing, Assembly, and Annotation

The genome of HL042PA3 was sequenced using Roche/454 FLX and assembled using a combination of PHRAP/CONSED (32) and GSMAPPER (Roche). HL201PA1 and HL202PA1 were sequenced using Illumina MiSeq (250 bp, paired-end) and assembled using Velvet (33). The remaining 66 genomes were sequenced previously as described (4). Coding sequences were predicted using GeneMark (34) and GLIMMER (35).

Computation of the Core Regions, Non-Core Regions and the Pan-Genome

The core regions were defined as genome sequences that were present in all 82 genomes, while the non-core regions were defined as genome sequences that were not present in all the genomes. KPA171202 was used as the reference genome. Each of the other 81 genome sequences (a series of contigs in most of the genomes and ten complete genomes) was mapped to the reference genome using Nucmer (36). All 81 “.coords” output files of the Nucmer program were analyzed to identify overlap regions based on the KPA171202 coordinates using a Perl script. Core sequences were then extracted based on the genome sequence of KPA171202 with the coordinates calculated above.
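The Perl script used for the overlap computation is not reproduced in this document. As a rough illustration only, the core regions can be thought of as the intersection, in reference coordinates, of the aligned intervals from every mapped genome. The sketch below assumes the aligned intervals have already been parsed from the “.coords” files into 1-based inclusive (start, end) pairs; the function names and toy intervals are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent 1-based inclusive intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def intersect_intervals(a, b):
    """Intersect two sorted, non-overlapping interval lists."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start <= end:
            out.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

def core_regions(aligned_by_genome):
    """Reference intervals covered by alignments from every genome."""
    core = None
    for intervals in aligned_by_genome.values():
        merged = merge_intervals(intervals)
        core = merged if core is None else intersect_intervals(core, merged)
    return core or []

# Toy alignments of two query genomes against a reference (illustrative only).
aligned = {
    "genome_A": [(1, 100), (90, 200), (300, 400)],
    "genome_B": [(50, 350)],
}
core = core_regions(aligned)
print(core, sum(end - start + 1 for start, end in core))
```

Subtracting these intervals from the full pan-genome coordinates would give the non-core regions in the same representation.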

The unique regions from each genome were added to the reference genome to make a “revised” reference genome, which contained the original sequence plus the unique genome sequences. This process was repeated for all the genomes until all the unique regions from all genomes were included in the pan-genome.

Lastly, core regions were subtracted from the pan-genome. The remaining regions were defined as non-core regions, which are not shared by all the strains. Protein coding sequences were predicted by GeneMark.hmm using KPA171202 as a reference file.

Identification of SNPs in the Core Regions

Single nucleotide polymorphisms (SNPs) were identified using the “show-snps” utility of the Nucmer program with the default settings (36). The genome sequence of KPA171202 was used as the reference genome. All 81 “.snps” output files of the Nucmer program were analyzed to identify unique SNP positions based on the KPA171202 coordinates using a Perl script.
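The step of collecting unique SNP positions across all genomes can be sketched as a union over per-genome SNP calls, followed by building one string of bases per genome at those positions. This is an illustration rather than the Perl script used in the study. It assumes each “.snps” file has already been parsed into (reference position, reference base, query base) tuples, that only substitutions are kept, and that a genome with no call at a position carries the reference base there; all names and values are hypothetical.

```python
def concatenate_snps(snps_by_genome, reference_name="reference"):
    """Return the sorted unique SNP positions and one concatenated SNP string per genome."""
    ref_base = {}
    for calls in snps_by_genome.values():
        for pos, ref, _query in calls:
            ref_base[pos] = ref
    positions = sorted(ref_base)

    sequences = {reference_name: "".join(ref_base[p] for p in positions)}
    for genome, calls in snps_by_genome.items():
        query_base = {pos: query for pos, _ref, query in calls}
        # Positions without a call are assumed to match the reference.
        sequences[genome] = "".join(query_base.get(p, ref_base[p]) for p in positions)
    return positions, sequences

# Toy calls for two query genomes (illustrative only).
calls = {
    "genome_1": [(10, "A", "G"), (25, "C", "T")],
    "genome_2": [(25, "C", "T"), (40, "G", "A")],
}
positions, sequences = concatenate_snps(calls)
print(positions)
print(sequences)
```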

Phylogenetic Tree Construction

The 82 concatenated sequences of the 123,223 SNP nucleotides in the core region were used to construct a phylogenetic tree of the P. acnes genomes. MEGA5 (37) was used to calculate the distance based on the SNPs in the core regions using the Neighbor-Joining method and the p-distance method. The bootstrap tree inferred from 200 replicates was taken.
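The p-distance between two strains is the proportion of compared sites at which they differ. The sketch below computes it for equal-length concatenated SNP strings and builds a pairwise table. It only illustrates the distance measure: the tree itself was built with MEGA5, and Neighbor-Joining and bootstrapping are not implemented here. The function names and sequences are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Proportion of sites that differ between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)

def pairwise_p_distances(sequences):
    """Return {(name_a, name_b): p-distance} for every unordered pair of names."""
    names = sorted(sequences)
    return {
        (a, b): p_distance(sequences[a], sequences[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

# Toy concatenated SNP strings (illustrative only).
toy_sequences = {"strain_1": "ACGT", "strain_2": "ACGA", "strain_3": "TCGA"}
table = pairwise_p_distances(toy_sequences)
print(table)
```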

Sequence Type Analysis Based on MLST Schemes

The sequence types of the 82 isolates were determined based on the MLST schemes published previously (11-13). The MLST gene sequences were aligned using BLAST against all the alleles used in the two MLST schemes.

Identification of CRISPR/Cas

CRISPRFinder (38) was used to identify the CRISPR repeat-spacer sequences. The annotation of HL110PA3 was used for BLAST alignment in order to identify the presence of the CRISPR/Cas structure and CRISPR repeat-spacer sequences in strains HL001PA1, HL060PA1, HL042PA3, HL082PA2, HL103PA1, HL106PA1, HL110PA4, HL202PA1, J139 and ATCC11828. Each spacer sequence was annotated by BLAST (39) against NCBI's non-redundant nucleotide database and the reference genomic sequences database (refseq_genomic).

Hierarchical Clustering Analysis of the Non-Core Regions

Among the 1,685 non-core fragments (895,905 bp in total), 314 non-core fragments with a length of >500 bp (747,189 bp in total, corresponding to 83% of all the non-core regions) were extracted and used in hierarchical clustering of the non-core regions. The Cluster 3.0 program (40) and the average linkage method were used. The clustering matrix was composed of 314 rows and 82 columns, in which 1 denotes presence of the non-core region and 0 denotes absence of the non-core region. The Java TreeView program (41) was used to display the clustering result.
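The construction of the presence/absence matrix and the average linkage step can be illustrated with a small, self-contained sketch. It is not the Cluster 3.0 implementation: the distance between strains is assumed here to be the number of fragments at which their presence/absence profiles differ, since the text does not state the similarity metric, and the fragment names, lengths and strain names are hypothetical.

```python
def presence_matrix(fragments, strains, min_length=500):
    """Rows are fragments longer than min_length; columns follow the order of strains."""
    return {
        name: [1 if s in present else 0 for s in strains]
        for name, (length, present) in fragments.items()
        if length > min_length
    }

def average_linkage(profiles):
    """Naive agglomerative clustering; returns the merges as (cluster_a, cluster_b, distance)."""
    def hamming(u, v):
        return sum(a != b for a, b in zip(u, v))

    def cluster_distance(c1, c2):
        # Average of all pairwise distances between members of the two clusters.
        total = sum(hamming(profiles[a], profiles[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(label,) for label in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        d, c1, c2 = min(
            (cluster_distance(x, y), x, y)
            for i, x in enumerate(clusters)
            for y in clusters[i + 1:]
        )
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
        merges.append((c1, c2, d))
    return merges

# Toy input: fragment name -> (length in bp, strains that carry it).
strains = ["strain_A", "strain_B", "strain_C", "strain_D"]
fragments = {
    "frag_1": (600, {"strain_A", "strain_B"}),
    "frag_2": (700, {"strain_A", "strain_B"}),
    "frag_3": (800, {"strain_C", "strain_D"}),
    "frag_4": (300, {"strain_A", "strain_C"}),  # dropped: not longer than 500 bp
}
matrix = presence_matrix(fragments, strains)
# One profile per strain: its column of the matrix.
profiles = {s: [row[i] for row in matrix.values()] for i, s in enumerate(strains)}
merges = average_linkage(profiles)
print(merges)
```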

REFERENCES

-   1. Grice E A, Segre J A. 2011. The skin microbiome. Nat Rev Microbiol 9:244-253.
-   2. White G M. 1998. Recent findings in the epidemiologic evidence, classification, and subtypes of acne vulgaris. J Am Acad Dermatol 39:S34-37.
-   3. precedings.nature.com/documents/5305/version/1
-   4. Fitz-Gibbon S T, Tomida S, Chiu B, Nguyen L, Du C, Liu M, Elashoff D, Erfe M C, Loncaric A, Kim J, Modlin R L, Miller J F, Sodergren E, Craft N, Weinstock G M, Li H. Propionibacterium acnes strain populations in the human skin microbiome associated with acne. Journal of Investigative Dermatology (in press).
-   5. Chambers H F, Deleo F R. 2009. Waves of resistance: Staphylococcus aureus in the antibiotic era. Nat Rev Microbiol 7:629-641.
-   6. Chase-Topping M, Gally D, Low C, Matthews L, Woolhouse M. 2008. Super-shedding and the link between human infection and livestock carriage of Escherichia coli O157. Nat Rev Microbiol 6:904-912.
-   7. Johnson J L, Cummins C S. 1972. Cell wall composition and deoxyribonucleic acid similarities among the anaerobic coryneforms, classical propionibacteria, and strains of Arachnia propionica. J Bacteriol 109:1047-1066.
-   8. McDowell A, Valanne S, Ramage G, Tunney M M, Glenn J V, McLorinan G C, Bhatia A, Maisonneuve J F, Lodes M, Persing D H, Patrick S. 2005. Propionibacterium acnes types I and II represent phylogenetically distinct groups. J Clin Microbiol 43:326-334.
-   9. Valanne S, McDowell A, Ramage G, Tunney M M, Einarsson G G, O'Hagan S, Wisdom G B, Fairley D, Bhatia A, Maisonneuve J F, Lodes M, Persing D H, Patrick S. 2005. CAMP factor homologues in Propionibacterium acnes: a new protein family differentially expressed by types I and II. Microbiology 151:1369-1379.
-   10. McDowell A, Perry A L, Lambert P A, Patrick S. 2008. A new phylogenetic group of Propionibacterium acnes. J Med Microbiol 57:218-224.
-   11. Lomholt H B, Kilian M. 2010. Population genetic analysis of Propionibacterium acnes identifies a subpopulation and epidemic clones associated with acne. PLoS One 5:e12277.
-   12. McDowell A, Gao A, Barnard E, Fink C, Murray P I, Dowson C G, Nagy I, Lambert P A, Patrick S. 2011. A novel multilocus sequence typing scheme for the opportunistic pathogen Propionibacterium acnes and characterization of type I cell surface-associated antigens. Microbiology 157:1990-2003.
-   13. McDowell A, Barnard E, Nagy I, Gao A, Tomida S, Li H, Eady A, Cove J, Nord C E, Patrick S. 2012. An Expanded Multilocus Sequence Typing Scheme for Propionibacterium acnes: Investigation of ‘Pathogenic’, ‘Commensal’ and Antibiotic Resistant Strains. PLoS One 7:e41480.
-   14. Bruggemann H, Henne A, Hoster F, Liesegang H, Wiezer A, Strittmatter A, Hujer S, Durre P, Gottschalk G. 2004. The complete genome sequence of Propionibacterium acnes, a commensal of human skin. Science 305:671-673.
-   15. The Human Microbiome Project Consortium. 2012. A framework for human microbiome research. Nature 486:215-221.
-   16. The Human Microbiome Project Consortium. 2012. Structure, function and diversity of the healthy human microbiome. Nature 486:207-214.
-   17. Nelson K E, Weinstock G M, Highlander S K, Worley K C, Creasy H H, Wortman J R, Rusch D B, Mitreva M, Sodergren E, Chinwalla A T, Feldgarden M, Gevers D, Haas B J, Madupu R, Ward D V, Birren B W, Gibbs R A, Methe B, Petrosino J F, Strausberg R L, Sutton G G, White O R, Wilson R K, Durkin S, Giglio M G, Gujja S, Howarth C, Kodira C D, Kyrpides N, Mehta T, Muzny D M, Pearson M, Pepin K, Pati A, Qin X, Yandava C, Zeng Q, Zhang L, Berlin A M, Chen L, Hepburn T A, Johnson J, McCorrison J, Miller J, Minx P, Nusbaum C, Russ C, Sykes S M, Tomlinson C M, Young S, Warren W C, Badger J, Crabtree J, Markowitz V M, Orvis J, Cree A, Ferriera S, Fulton L L, Fulton R S, Gillis M, Hemphill L D, Joshi V, Kovar C, Torralba M, Wetterstrand K A, Abouelleil A, Wollam A M, Buhay C J, Ding Y, Dugan S, FitzGerald M G, Holder M, Hostetler J, Clifton S W, Allen-Vercoe E, Earl A M, Farmer C N, Liolios K, Surette M G, Xu Q, Pohl C, Wilczek-Boney K, Zhu D. 2010. A catalog of reference genomes from the human microbiome. Science 328:994-999.
-   18. Brzuszkiewicz E, Weiner J, Wollherr A, Thurmer A, Hupeden J, Lomholt H B, Kilian M, Gottschalk G, Daniel R, Mollenkopf H J, Meyer T F, Bruggemann H. 2011. Comparative genomics and transcriptomics of Propionibacterium acnes. PLoS One 6:e21581.
-   19. Hunyadkurti J, Feltoti Z, Horvath B, Nagymihaly M, Voros A, McDowell A, Patrick S, Urban E, Nagy I. 2011. Complete genome sequence of Propionibacterium acnes type IB strain 6609. J Bacteriol 193:4561-4562.
-   20. Horvath B, Hunyadkurti J, Voros A, Fekete C, Urban E, Kemeny L, Nagy I. 2012. Genome sequence of Propionibacterium acnes type II strain ATCC 11828. J Bacteriol 194:202-203.
-   21. Voros A, Horvath B, Hunyadkurti J, McDowell A, Barnard E, Patrick S, Nagy I. 2012. Complete genome sequences of three Propionibacterium acnes isolates from the type IA(2) cluster. J Bacteriol 194:1621-1622.
-   22. McDowell A, Hunyadkurti J, Horvath B, Voros A, Barnard E, Patrick S, Nagy I. 2012. Draft genome sequence of an antibiotic-resistant Propionibacterium acnes strain, PRP-38, from the novel type IC cluster. J Bacteriol 194:3260-3261.
-   23. Niazi S A, Clarke D, Do T, Gilbert S C, Mannocci F, Beighton D. 2010. Propionibacterium acnes and Staphylococcus epidermidis isolated from refractory endodontic lesions are opportunistic pathogens. J Clin Microbiol 48:3859-3869.
-   24. Tettelin H, Riley D, Cattuto C, Medini D. 2008. Comparative genomics: the bacterial pan-genome. Curr Opin Microbiol 11:472-477.
-   25. Schloissnig S, Arumugam M, Sunagawa S, Mitreva M, Tap J, Zhu A, Waller A, Mende D R, Kultima J R, Martin J, Kota K, Sunyaev S R, Weinstock G M, Bork P. 2013. Genomic variation landscape of the human gut microbiome. Nature 493:45-50.
-   26. Ross J I, Snelling A M, Carnegie E, Coates P, Cunliffe W J, Bettoli V, Tosti G, Katsambas A, Galvan Perez Del Pulgar J I, Rollman O, Torok L, Eady E A, Cove J H. 2003. Antibiotic-resistant acne: lessons from Europe. The British journal of dermatology 148:467-478.
-   27. Jackson A P, Thomas G H, Parkhill J, Thomson N R. 2009. Evolutionary diversification of an ancient gene family (rhs) through C-terminal displacement. BMC Genomics 10:584.
-   28. Horvath P, Barrangou R. 2010. CRISPR/Cas, the immune system of bacteria and archaea. Science 327:167-170.
-   29. Harris S R, Feil E J, Holden M T, Quail M A, Nickerson E K, Chantratita N, Gardete S, Tavares A, Day N, Lindsay J A, Edgeworth J D, de Lencastre H, Parkhill J, Peacock S J, Bentley S D. 2010. Evolution of MRSA during hospital transmission and intercontinental spread. Science 327:469-474.
-   30. Croucher N J, Harris S R, Fraser C, Quail M A, Burton J, van der Linden M, McGee L, von Gottberg A, Song J H, Ko K S, Pichon B, Baker S, Parry C M, Lambertsen L M, Shahinas D, Pillai D R, Mitchell T J, Dougan G, Tomasz A, Klugman K P, Parkhill J, Hanage W P, Bentley S D. 2011. Rapid pneumococcal evolution in response to clinical interventions. Science 331:430-434.
-   31. Chin C S, Sorenson J, Harris J B, Robins W P, Charles R C, Jean-Charles R R, Bullard J, Webster D R, Kasarskis A, Peluso P, Paxinos E E, Yamaichi Y, Calderwood S B, Mekalanos J J, Schadt E E, Waldor M K. 2011. The origin of the Haitian cholera outbreak strain. The New England journal of medicine 364:33-42.
-   32. Gordon D, Abajian C, Green P. 1998. Consed: a graphical tool for sequence finishing. Genome Res 8:195-202.
-   33. Zerbino D R, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 18:821-829.
-   34. Borodovsky M, McIninch J. 1993. Recognition of genes in DNA sequence with ambiguities. Biosystems 30:161-171.
-   35. Salzberg S L, Delcher A L, Kasif S, White O. 1998. Microbial gene identification using interpolated Markov models. Nucleic Acids Res 26:544-548.
-   36. Kurtz S, Phillippy A, Delcher A L, Smoot M, Shumway M, Antonescu C, Salzberg S L. 2004. Versatile and open software for comparing large genomes. Genome Biol 5:R12.
-   37. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol 28:2731-2739.
-   38. Grissa I, Vergnaud G, Pourcel C. 2007. CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats. Nucleic Acids Res 35:W52-57.
-   39. Altschul S F, Gish W, Miller W, Myers E W, Lipman D J. 1990. Basic local alignment search tool. J Mol Biol 215:403-410.
-   40. de Hoon M J, Imoto S, Nolan J, Miyano S. 2004. Open source clustering software. Bioinformatics 20:1453-1454.
-   41. Saldanha A J. 2004. Java Treeview--extensible visualization of microarray data. Bioinformatics 20:3246-3248.

Example 3—Microbial DNA Extraction from Skin Samples

Skin Microcomedone Sampling

Skin microcomedone (white head or black head) samples were taken from the skin of the subjects using a specialized adhesive tape. The skin was moistened with water before the adhesive tape was put on. The tape was left on the skin for 15-20 minutes until it became dry. Clean gloves were used for each sampling. After being taken off from the skin, the tape was placed into a 50 mL sterile tube. This can be applied to many skin sites, such as the nose, forehead, chin, and back.

Bacterial DNA Extraction

Microcomedones were individually picked or scraped off from the adhesive tape using sterile forceps and placed in a 2 mL sterile microcentrifuge tube filled with Buffer ATL (Qiagen) and 0.1 mm diameter glass beads (BioSpec Products, Inc., Bartlesville, Okla.). Cells were lysed using a beadbeater for 3 minutes at 4,800 rpm at room temperature. After centrifugation at 14,000 rpm for 5 minutes, the supernatant was retrieved and used for genomic DNA extraction using QIAamp DNA Micro Kit (Qiagen). The manufacturer protocol for extracting DNA from chewing gum was used. Concentration of the genomic DNA was determined by a spectrometer.

Example 4—Microbiome Type Detection

Detailed Protocol for Accurate Detection of the Skin Microbiome Type Based on 16S rDNA

PCR amplification, cloning, and sequencing

16S rDNA was amplified using primers 8F (5′-AGAGTTTGATYMTGGCTCAG-3′) and 1510R (5′-TACGGYTACCTTGTTACGACTT-3′). Thermocycling conditions were as follows: an initial denaturation step of 5 minutes at 94° C.; 30 cycles of denaturation at 94° C. for 45 seconds, annealing at 52° C. for 30 seconds, and elongation at 72° C. for 90 seconds; and a final elongation step at 72° C. for 20 minutes. PCR products were purified using a column-based method. Subsequently, the 16S rDNA amplicons were cloned into pCR 2.1-TOPO vector (Invitrogen). One-Shot TOP-10 Chemically Competent E. coli cells (Invitrogen) were transformed with the vectors and plated on selective media. Individual positive colonies were picked and inoculated into selective LB liquid medium. After 14 hours of incubation, the plasmids were extracted and purified, using either a column-based plasmid extraction kit or traditional methods. The clones were sequenced bidirectionally using the Sanger sequencing method. The microbiome type of each individual was determined based on the 16S rDNA sequence data of the top 10 major ribotypes. See FIG. 5 and SEQ ID NOs 1-10.
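Primers 8F and 1510R contain IUPAC degenerate bases (Y = C or T; M = A or C), so each primer stands for a small pool of concrete oligonucleotides. The following Python sketch enumerates that pool; the function name `expand_degenerate` is hypothetical and is offered only as an illustration, not as part of the protocol described above.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "S": "CG", "W": "AT", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_degenerate(primer: str) -> list[str]:
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Primer sequences as given in the text (5'-3').
primer_8f = "AGAGTTTGATYMTGGCTCAG"
primer_1510r = "TACGGYTACCTTGTTACGACTT"

variants_8f = expand_degenerate(primer_8f)
variants_1510r = expand_degenerate(primer_1510r)

for variant in variants_8f:
    print("8F variant:", variant)
print("Pool sizes:", len(variants_8f), len(variants_1510r))
```

The pool size is the product of the number of bases allowed at each degenerate position, which is why a primer with one Y and one M expands to more sequences than a primer with a single Y.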

Detailed protocols for fast detection of the skin microbiome type based on PCR and qPCR

By sequencing and annotating 69 novel P. acnes genomes and by comparing a total of 82 P. acnes genomes, several genomic loci that are unique to acne-associated P. acnes strains were identified, i.e., Loci 1-4. See FIG. 8. The genomic sequences of Loci 1, 2, 3, and 4 from all sequenced P. acnes strains and their sequence similarities, which range from 95% to 100%, are listed as SEQ ID NOs 15-18, respectively.

Detection Method 1

To rapidly detect the presence or absence of RT4 and RT5 strains in patients, multiplex PCR targeting Loci 1, 2, and 3 was designed and performed on genomic DNA extracted from P. acnes strains and skin samples. FIG. 23 shows that Loci 1, 2, and 3 were amplified from various P. acnes strains as predicted based on the genome data.

The PCR primer sequences are shown in Table 1:

TABLE 1
Primers specific for loci 1, 2, 3 and Pak (housekeeping gene)

Target locus   Forward primer                          Reverse primer
Locus 1        GGTATCCACCGAGATGGAAG (SEQ ID NO: 11)    GTGGTCCCAGGTGACATTCT (SEQ ID NO: 12)
Locus 2        CGACATCGACGTTTCATCTG (SEQ ID NO: 13)    GTGTTCTCCTCGTGCTGGTT (SEQ ID NO: 14)
Locus 3        GATAATCCGTTCGACAAGCTG (SEQ ID NO: 15)   ACCCACCACGATGATGTTT (SEQ ID NO: 16)
Pak            CGACGCCTCCAATAACTTCC                    GTCGGCCTCCTCAGCATC

Additional primers targeting these loci can be designed based on the genome sequences of loci 1-4 (SEQ ID NOs 15-18, respectively). Each 20 μL reaction contains 12.7 μL molecular grade H2O, 2 μL 10× High Fidelity Buffer, 0.6 μL 50 mM MgSO4, 0.4 μL 10 mM dNTP, 0.8 μL of each primer (final primer concentration is 400 nM), 0.1 μL Platinum Taq DNA Polymerase High Fidelity (all reagents from Invitrogen), and 1 μL gDNA template (approx. 40 ng gDNA). The thermocycling conditions are as follows: an initial denaturation step of 10 minutes at 95° C.; 35 cycles, each consisting of 45 seconds at 95° C., 30 seconds at 65° C., and 45 seconds at 72° C.; and a final elongation step of 10 minutes at 72° C.
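The reaction recipe above can be sanity-checked with simple dilution arithmetic (final concentration = stock concentration × volume added / total volume). The sketch below is illustrative only. It assumes four primers per reaction, because that is the number for which the listed volumes sum to 20 μL; the text does not state the primer count, and all variable names are hypothetical.

```python
# Component volumes (µL) for one 20 µL reaction, taken from the recipe above.
# ASSUMPTION: four primers at 0.8 µL each; the source does not state how many
# primers are combined, but four makes the listed volumes total 20 µL.
N_PRIMERS = 4
PRIMER_VOLUME_UL = 0.8

volumes_ul = {
    "water": 12.7,
    "10x_high_fidelity_buffer": 2.0,
    "MgSO4_50mM": 0.6,
    "dNTP": 0.4,
    "primers": PRIMER_VOLUME_UL * N_PRIMERS,
    "polymerase": 0.1,
    "gDNA_template": 1.0,
}
total_ul = sum(volumes_ul.values())

def final_concentration(stock, volume_ul, total_volume_ul):
    """Dilution formula C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_volume_ul

# Final MgSO4 concentration (mM) from the 50 mM stock.
mgso4_final_mM = final_concentration(50.0, volumes_ul["MgSO4_50mM"], total_ul)

# Primer stock concentration (nM) implied by a 400 nM final concentration.
primer_stock_nM = 400.0 * total_ul / PRIMER_VOLUME_UL

print(f"total volume: {total_ul:.1f} µL")
print(f"final MgSO4: {mgso4_final_mM:.2f} mM")
print(f"implied primer stock: {primer_stock_nM / 1000:.1f} µM")
```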

To quantitatively measure the abundance of acne-associated P. acnes strains in skin samples, quantitative PCR (qPCR) targeting Loci 1, 2, and 3 was performed on genomic DNA extracted from P. acnes strains. See FIG. 12. The LightCycler 480 High Resolution Melting Master kit was used (Roche Diagnostics GmbH, Mannheim, Germany). Each 10 μL reaction contains 5 μL of Master Mix (2× concentrate), 1 μL of 25 mM MgCl2, 0.5 μL of 4 μM forward and reverse primers, 1 μL to 3.5 μL of DNA template (approximately 2.5 ng DNA), and molecular grade H2O up to the final volume. The thermocycling conditions were as follows: an initial activation step of 10 minutes at 95° C.; 40 amplification cycles, each consisting of 10 seconds at 95° C., 15 seconds at 65° C. during the first cycle with a stepwise 0.5° C. decrease for each succeeding cycle, and 30 seconds at 72° C.; and finally a melting curve step, starting at 65° C. and ending at 99° C., with a ramp rate of 0.02° C./s and an acquisition rate of 25 per ° C.
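The touchdown annealing schedule and the melting curve step described above imply a per-cycle temperature series and a fixed ramp duration. The sketch below works these out. It assumes the 0.5° C. decrease continues through all 40 cycles, since the text does not state a lower limit; the function and variable names are hypothetical.

```python
def touchdown_annealing_temps(start_c=65.0, step_c=0.5, cycles=40):
    """Annealing temperature per cycle, dropping by step_c after each cycle.

    ASSUMPTION: the decrease applies to every succeeding cycle with no floor.
    """
    return [start_c - step_c * cycle for cycle in range(cycles)]

annealing_temps = touchdown_annealing_temps()

# Melting curve parameters from the text.
melt_start_c = 65.0
melt_end_c = 99.0
ramp_c_per_s = 0.02
acquisitions_per_c = 25

melt_span_c = melt_end_c - melt_start_c
melt_duration_min = melt_span_c / ramp_c_per_s / 60
melt_acquisitions = melt_span_c * acquisitions_per_c

print(f"first/last annealing temperature: {annealing_temps[0]} / {annealing_temps[-1]} °C")
print(f"melting curve duration: {melt_duration_min:.1f} min")
print(f"fluorescence acquisitions: {melt_acquisitions:.0f}")
```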

The protocol was tested using mock samples, where different strains of P. acnes were mixed in different proportions to mimic the strain population distributions in real samples. See Table 2.

TABLE 2
Mock samples mimicking the microbiome types observed in human skin samples

Mock     Microbiome   Dominant   Percentage of      Abundance    Abundance
sample   type         isolate    dominant isolate   of Locus 1   of Locus 2
1A       I            HL036PA3   80                 7.5%         7.5%
1B       I            HL078PA1   80                 7.5%         87.5%
1C       I            HL106PA2   80                 87.5%        87.5%
2        II           HL103PA1   80                 8.0%         10.0%
3        III          HL087PA3   80                 8.0%         10.0%
4        IV           HL038PA1   70                 79.0%        82.0%
5A       V            HL072PA2   80                 6.7%         8.9%
5B       V            HL096PA1   80                 86.7%        88.9%
6        minor        HL110PA4   70                 12.0%        15.0%
8A       minor        HL086PA1   70                 80.0%        83.3%
8B       minor        HL092PA1   70                 10.0%        13.3%

The concentration of each locus was quantified from standards derived from Locus 1, Locus 2, and Pak PCR amplicons. The copy number of each gene was quantified from genomic DNA standards that were derived from TadA (in Locus 3) and Pak amplicons using conventional PCR. See FIG. 24.

Detection Method 2

P. acnes TaqMan qPCR Assay

Primer and Probe Design

Primers and probes for detecting Loci 1, 2, 3, and 4 in P. acnes strains and clinical samples were designed as listed in Table 3:

A triplex TaqMan qPCR was designed and tested using Propionibacterium specific primers to target Locus 1, Locus 3, and an internal control, Pak, present in all P. acnes.

TABLE 3
Primer and probe sequences used for identification of P. acnes loci

Targeted   Primer/          Primer/Probe   Amplicon
region     Probe            name           size (bp)   Sequence (5′-3′)
Locus1     forward primer   Locus1_F       107         GAAGAATCCCGCTCCATTTCC (SEQ ID NO: 17)
Locus1     reverse primer   Locus1_R                   CCTTTCTTGTAGCCGAGCAG (SEQ ID NO: 18)
Locus1     probe            Locus1_Probe               56-FAM/ATTGTCACC/ZEN/TGGGACCACCGTAAAC/3IABkFQ (SEQ ID NO: 19)
Locus2     forward primer   Locus2_F       103         CGTGATCCTGATCGACTGTG (SEQ ID NO: 20)
Locus2     reverse primer   Locus2_R                   GCTCCACAACTTCGAGTGC (SEQ ID NO: 21)
Locus2     probe            Locus2_Probe               CAGGCCGTTGATCGTGAGCTGA (SEQ ID NO: 22)
Locus3     forward primer   Locus3_F       104         TGCTGATAATCCGTTCGACA (SEQ ID NO: 23)
Locus3     reverse primer   Locus3_R                   ACGACGTCCGAAAACAACTCC (SEQ ID NO: 24)
Locus3     probe            Locus3_Probe               5TET/CTCTACCGA/ZEN/AGCTCTTGCCGCAT/3IABkFQ (SEQ ID NO: 25)
Locus4     forward primer   Locus4_F       103         ATCGCCGTCGACAGGTAGT (SEQ ID NO: 26)
Locus4     reverse primer   Locus4_R                   CCGAGATTCTGCGCCTAGT (SEQ ID NO: 27)
Locus4     probe            Locus4_Probe               CGGTGCCCTTGCTGAGGTACA (SEQ ID NO: 28)
Pak        forward primer   Pak_F          101         GCAACCCGACATCCTCATTA
Pak        reverse primer   Pak_R                      AGTCGAAGAAGTCGCTCAGG
Pak        probe            Pak_Probe                  VIC/CGTTCTACAGCCACCACGACGG/TAMRA
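When primers are combined in a multiplex assay, it is common to check that their lengths, GC content, and approximate melting temperatures are comparable. The sketch below applies two generic rule-of-thumb calculations to the triplex primers listed in Table 3. It is not the design procedure used by Applicants, which the text does not describe; the Wallace rule (2° C. per A/T, 4° C. per G/C) is only a coarse estimate for oligonucleotides of this length, and the function names are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq: str) -> int:
    """Rule-of-thumb melting temperature: 2 °C per A/T plus 4 °C per G/C."""
    seq = seq.upper()
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count

# Primer sequences copied from Table 3 (triplex targets only).
primers = {
    "Locus1_F": "GAAGAATCCCGCTCCATTTCC",
    "Locus1_R": "CCTTTCTTGTAGCCGAGCAG",
    "Locus3_F": "TGCTGATAATCCGTTCGACA",
    "Locus3_R": "ACGACGTCCGAAAACAACTCC",
    "Pak_F": "GCAACCCGACATCCTCATTA",
    "Pak_R": "AGTCGAAGAAGTCGCTCAGG",
}

for name, seq in primers.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
          f"Wallace Tm ~{wallace_tm(seq)} °C")
```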

Locus 1, Locus 3, and Pak Amplification

Benchtop amplification was carried out to assess specificity of designed primers and to determine optimum cycling conditions prior to multiplexing. Amplification was carried out using a BioRad C1000 thermal cycler. Singleplex PCR reactions contained 0.2 μM target specific primers, 10× Platinum Taq buffer (Invitrogen), 1.0 mM MgCl2, 0.2 mM each dNTP, 0.5 U/μl Platinum Taq polymerase, and 1 μl DNA template, and were made up to a final volume of 10 μl. Cycling was as follows: initial denaturation at 94° C. for 5 minutes, followed by 30 cycles of denaturation at 94° C. for 30 seconds, annealing at 60° C. for 30 seconds, and extension at 72° C. for 30 seconds, and one final extension cycle at 72° C. for 5 minutes. Amplification products were analysed electrophoretically on a 1% agarose/TBE gel to check for correct amplification of target and cross-species reactivity with primer targets.

Taqman Triplex PCR

Triplex qPCR was carried out using an Applied Biosystems 7900HT instrument. 1-2 μl of sample DNA were added to a mastermix containing 2× QuantiTect Multiplex PCR Master Mix (Qiagen); 0.2 μM primers Locus1_F, Locus1_R, Pak_F, and Pak_R; 0.2 μM probes Locus1_Probe and Pak_Probe; 0.1 μM primers Locus3_F and Locus3_R; and 0.1 μM Locus3_Probe. The reaction mix was made up to a final volume of 20 μl with sterile PCR grade water. The PCR program consisted of one cycle at 50° C. for 2 minutes, followed by one cycle at 95° C. for 15 minutes to allow for activation of the multiplex mastermix, then 45 cycles of 94° C. for 60 seconds and 57° C. for 90 seconds. Each run contained calibrators of extracted P. acnes DNA from culture, as well as no-template controls (NTC) and water controls. qPCR was run with a passive reference, ROX, supplied in the QuantiTect Multiplex PCR mastermix. Data were analysed using the SDS v2.4 software.

Assay Calibration, Sensitivity, and Specificity

Calibration curves for P. acnes targets Locus 1 and Pak were constructed by plotting mean Ct values for a series of log dilutions of quantified genomic DNA standards extracted from P. acnes from pure culture. Genome equivalents were estimated. Five replicates of P. acnes calibrators were used to calculate mean Ct values and standard deviations. These data were used to determine sensitivity of the assay and the limits of detection (LOD). Calibration plots were used to determine the number of P. acnes genomes in clinical samples, with one copy of Locus1 and Pak targets per genome. DNA concentration and copy number were determined and serial ten-fold dilutions of the purified product were used as standards for construction of the Locus3 calibration plot. Strains that display possible combinations of the presence and absence of Locus1 and Locus 3 were used for Locus1 and Pak calibration: HL038PA1 (Locus1+, Locus3+), HL083PA1 (Locus1+, Locus3−), HL078PA1 (Locus1−, Locus3+) and HL063PA1 (Locus1−, Locus3−). The assay was validated using sequenced P. acnes strains from pure culture with known loci before being applied to clinical samples. The specificity of the assay for each target was tested using other bacterial species including skin commensals and other Propionibacteria.
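A calibration curve of this kind is conventionally a straight-line fit of Ct against log10 of the genome copies, which is then inverted to estimate copies in an unknown sample. The sketch below illustrates that calculation. All Ct values and copy numbers in it are hypothetical placeholders, not data from this application, and for brevity a single curve is reused for both targets, whereas the text describes a separate curve per target.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# HYPOTHETICAL calibrators: genome copies per reaction and mean Ct values.
standard_copies = [1e1, 1e2, 1e3, 1e4, 1e5]
standard_mean_ct = [34.7, 31.4, 28.1, 24.8, 21.5]

slope, intercept = fit_line(
    [math.log10(copies) for copies in standard_copies], standard_mean_ct
)
# Amplification efficiency implied by the slope (1.0 means perfect doubling).
efficiency = 10 ** (-1 / slope) - 1

def copies_from_ct(ct: float) -> float:
    """Invert the calibration line to estimate genome copies from a Ct value."""
    return 10 ** ((ct - intercept) / slope)

# HYPOTHETICAL Ct values for one sample. With one copy of each target per
# genome, the Locus1/Pak ratio estimates the fraction of Locus1-positive
# P. acnes genomes in the sample.
pak_copies = copies_from_ct(24.8)
locus1_copies = copies_from_ct(28.1)
locus1_fraction = locus1_copies / pak_copies

print(f"slope {slope:.2f}, intercept {intercept:.2f}, efficiency {efficiency:.0%}")
print(f"Locus1-positive fraction: {locus1_fraction:.1%}")
```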

Assay Validation

A total of 24 sequenced P. acnes strains (HL063PA1, HL078PA1, HL083PA1, HL038PA1, HL037PA1, HL082PA1, HL020PA1, HL001PA1, HL046PA2, HL043PA1, HL086PA1, HL110PA3, HL110PA4, HL007PA1, HL087PA3, HL027PA1, HL056PA1, HL067PA1, HL074PA1, HL045PA1, HL053PA1, HL005PA1, HL072PA1, HL043PA2), including possible combinations with and without Locus 1 and Locus 3, were used to validate the triplex qPCR assays. The qPCR triplex assay successfully identified Locus 1 and Locus 3 in strains previously shown by whole genome sequencing to harbor these loci.

Application to Clinical Samples

Genomic DNA extracted from two clinical samples, #1 and #2, was analyzed using the TaqMan qPCR triplex assay. Amplification plots revealed the presence of P. acnes (Pak) in both samples (FIG. 6). Locus1 and Locus 3 targets were also detected in both samples, with a much larger percentage of P. acnes Locus1-positive and Locus3-positive strains present in 1 μl of sample #1 compared to sample #2.

Example 5—Acne Vaccine

Strains with 16S rDNA ribotypes (RTs) 4, 5, 7, 8, 9, and 10 were identified as highly associated with acne. Vaccines can be raised against these strains. See T. Nakatsuji et al., 128(10) J. Invest. Dermatol. 2451-2457 (October 2008).

Example 6—Probiotic Development Utilizing the Strains Associated with Healthy Skin in Topical Creams, Solutions, and the Like for Cosmetic and Other Products

RT6 is mostly found in healthy skin. These strains can be used as probiotics in topical products for acne prevention and treatment. Four RT6 strains, including HL110PA3, HL110PA4, HL042PA3, and HL202PA1, were isolated and sequenced.

In addition, bacterial culture supernatant and/or cell lysate, including bacterial metabolites, can be used in creams, solutions, and other cosmetic products to prevent the growth of strains associated with acne. Sequences sharing at least 95% homology with SEQ ID NOs 51-54 may be used for the development of probiotics and the like.

Example 7—Drug Development Targeting Specific Strains that are Associated with Acne

Identification of the Core and Non-Core Regions of P. acnes

The “core” genome regions of P. acnes were defined as genome sequences that are present in all of the 82 genomes, while the “non-core” regions were defined as genome sequences that are NOT present in all the genomes. See S. Tomida et al., Pan-genome and Comparative Genome Analyses of Propionibacterium acnes Reveal Its Genomic Diversity in the Healthy and Diseased Human Skin Microbiome (in press); see also Example 2. Non-core regions specific to strains of RTs 4 and 5, e.g., loci 1, 2, and 3, were identified, as mentioned previously. Non-core regions specific to strains of RT8 (noted as Locus 4), as well as to several other strains such as HL078PA1, HL030PA2, HL063PA2, P.acn17, HL097PA1, and PRP38, were also identified. See FIG. 20. The genomic sequence of Locus 4 is set forth as SEQ ID NO:18. The genes in loci 1-4 that are mostly unique to acne-associated strains RT4, RT5, and RT8 are listed in Tables 4-1, 4-2, and 4-3. Non-core sequences are also set forth. The genes encoded in these loci are drug targets.

TABLE 4-1
List of genes encoded in loci 1 and 2, specific to RT4 and 5

Locus in FIG. 3a   ID              Description
Locus 1            GM131           ABC transporter ATP-binding protein
Locus 1            GM132/GM133*²   Site-specific recombinase
Locus 1            GM134           Site-specific recombinase
Locus 1            GM135           Hypothetical protein
Locus 1            GM136           Hypothetical protein
Locus 1            GM137           N-acetylmuramoyl-L-alanine amidase
Locus 2            GM171           Hypothetical protein
Locus 2            GM172           Hypothetical protein
Locus 2            GM173           Single-strand binding family protein
Locus 2            GM174           CobQ/CobB/MinD/ParA nucleotide binding domain protein
Locus 2            GM175           Hypothetical protein
Locus 2            GM176           Hypothetical protein
Locus 2            GM177           Hypothetical protein
Locus 2            GM178           Hypothetical protein
Locus 2            GM179           Hypothetical protein
Locus 2            GM180           Hypothetical protein
Locus 2            GM181           CAAX amino protease family protein
Locus 2            GM182           Hypothetical protein
Locus 2            GM183           YcaO-like protein
Locus 2            GM184           Hypothetical protein
Locus 2            GM185           SagB-type dehydrogenase domain protein
Locus 2            GM186           Hypothetical protein
Locus 2            GM187           ABC transporter, ATP-binding protein
Locus 2            GM188           ABC-2 type transporter
Locus 2            GM189           Hypothetical protein
Locus 2            GM196           Hypothetical protein

TABLE 4-2
List of genes encoded in Locus 3, a linear plasmid and specific to RT4 and 5

Locus     ID          Description
Locus 3   PAGK_2319   hypothetical protein
Locus 3   PAGK_2320   hypothetical protein
Locus 3   PAGK_2321   hypothetical protein
Locus 3   PAGK_2322   plasmid stabilization system protein
Locus 3   PAGK_2323   hypothetical protein
Locus 3   PAGK_2324   hypothetical protein
Locus 3   PAGK_2325   hypothetical protein
Locus 3   PAGK_2326   CobQ/CobB/MinD/ParA nucleotide binding domain
Locus 3   PAGK_2327   hypothetical protein
Locus 3   PAGK_2328   hypothetical protein
Locus 3   PAGK_2329   hypothetical protein
Locus 3   PAGK_2330   hypothetical protein
Locus 3   PAGK_2331   hypothetical protein (similar to PPA1279)
Locus 3   PAGK_2332   plasmid partition protein ParA
Locus 3   PAGK_2333   hypothetical protein
Locus 3   PAGK_2334   hypothetical protein
Locus 3   PAGK_2335   hypothetical protein
Locus 3   PAGK_2336   putative ribbon-helix-helix protein, copG family
Locus 3   PAGK_2337   putative ribonuclease E
Locus 3   PAGK_2338   hypothetical protein (similar to PPA1294)
Locus 3   PAGK_2339   hypothetical protein (similar to PPA1295)
Locus 3   PAGK_2340   putative permease
Locus 3   PAGK_2341   hypothetical protein (similar to PPA1297)
Locus 3   PAGK_2342   hypothetical protein (similar to PPA1298)
Locus 3   PAGK_2343   hypothetical protein (similar to PPA1299)
Locus 3   PAGK_2344   hypothetical protein (similar to CLOLEP_00122)
Locus 3   PAGK_2345   hypothetical protein (similar to CLOLEP_00123)
Locus 3   PAGK_2346   hypothetical protein (similar to CLOLEP_00124)
Locus 3   PAGK_2347   hypothetical protein (similar to CLOLEP_00125)
Locus 3   PAGK_2348   hypothetical protein (similar to CLOLEP_00126)
Locus 3   PAGK_2349   hypothetical protein (similar to CLOLEP_00127)
Locus 3   PAGK_2350   hypothetical protein
Locus 3   PAGK_2351   hypothetical protein (similar to CLOLEP_00129)
Locus 3   PAGK_2352   hypothetical protein (similar to CLOLEP_00130)
Locus 3   PAGK_2353   hypothetical protein (similar to CLOLEP_00131)
Locus 3   PAGK_2354   hypothetical protein (similar to CLOLEP_00132)
Locus 3   PAGK_2355   hypothetical protein (similar to CLOLEP_00134)
Locus 3   PAGK_2356   hypothetical protein (similar to CLOLEP_00135)
Locus 3   PAGK_2357   hypothetical protein (similar to CLOLEP_00141)
Locus 3   PAGK_2358   hypothetical protein (similar to CLOLEP_00142)
Locus 3   PAGK_2359   hypothetical protein (similar to CLOLEP_00143)
Locus 3   PAGK_2360   hypothetical protein (similar to CLOLEP_00144, RcpC)
Locus 3   PAGK_2361   hypothetical protein (similar to CLOLEP_00145, TadZ)
Locus 3   PAGK_2362   hypothetical protein (similar to CLOLEP_00146, TadA)
Locus 3   PAGK_2363   hypothetical protein (similar to CLOLEP_00147, TadB)
Locus 3   PAGK_2364   hypothetical protein (similar to CLOLEP_00148, TadC)
Locus 3   PAGK_2365   hypothetical protein (similar to CLOLEP_00149, Flp-1)
Locus 3   PAGK_2366   hypothetical protein (similar to CLOLEP_00151, TadE)
Locus 3   PAGK_2367   hypothetical protein (similar to CLOLEP_00152, TadE)
Locus 3   PAGK_2368   hypothetical protein (similar to CLOLEP_00153, TadE)
Locus 3   PAGK_2369   hypothetical protein (similar to CLOLEP_00154)
Locus 3   PAGK_2370   hypothetical protein (similar to CLOLEP_00157)
Locus 3   PAGK_2371   hypothetical protein (similar to CLOLEP_00158)
Locus 3   PAGK_2372   hypothetical protein (similar to CLOLEP_00159)
Locus 3   PAGK_2373   hypothetical protein (similar to CLOLEP_00160)
Locus 3   PAGK_2374   hypothetical protein
Locus 3   PAGK_2375   hypothetical protein (similar to CLOLEP_00162)
Locus 3   PAGK_2376   hypothetical protein (similar to CLOLEP_00163)
Locus 3   PAGK_2377   hypothetical protein (similar to CLOLEP_00164)
Locus 3   PAGK_2378   hypothetical protein (similar to CLOLEP_00166)
Locus 3   PAGK_2379   repA
Locus 3   PAGK_2380   CobQ/CobB/MinD/ParA nucleotide binding domain
Locus 3   PAGK_2381   hypothetical protein
Locus 3   PAGK_2382   hypothetical protein
Locus 3   PAGK_2383   Yag1E
Locus 3   PAGK_2384   hypothetical protein
Locus 3   PAGK_2385   hypothetical protein
Locus 3   PAGK_2386   hypothetical protein
Locus 3   PAGK_2387   hypothetical protein
Locus 3   PAGK_2388   hypothetical protein
Locus 3   PAGK_2389   hypothetical protein
Locus 3   PAGK_2390   hypothetical protein
Locus 3   PAGK_2391   hypothetical protein
Locus 3   PAGK_2392   ResA

TABLE 4-3
List of genes encoded in Locus 4, RT8 specific region

Locus     ID                 Description
Locus 4   HMPREF9576_00292   tRNA adenylyltransferase
Locus 4   HMPREF9576_00293   conserved hypothetical protein
Locus 4   HMPREF9576_00294   conserved domain protein
Locus 4   HMPREF9576_00295   response regulator receiver domain protein
Locus 4   HMPREF9576_00296   histidine kinase
Locus 4   HMPREF9576_00297   hypothetical protein
Locus 4   HMPREF9576_00298   hypothetical protein
Locus 4   HMPREF9576_00299   hypothetical protein
Locus 4   HMPREF9576_00300   hypothetical protein
Locus 4   HMPREF9576_00301   hypothetical protein
Locus 4   HMPREF9576_00302   drug resistance MFS transporter, drug: H+ antiporter-2 (14 Spanner) (DHA2) family protein
Locus 4   HMPREF9576_00303   hypothetical protein
Locus 4   HMPREF9576_00304   conserved domain protein
Locus 4   HMPREF9576_00305   beta-ketoacyl synthase, N-terminal domain protein
Locus 4   HMPREF9576_00306   hypothetical protein
Locus 4   HMPREF9576_00307   acetyltransferase, GNAT family
Locus 4   HMPREF9576_00308   putative (3R)-hydroxymyristoyl-ACP dehydratase
Locus 4   HMPREF9576_00309   putative acyl carrier protein
Locus 4   HMPREF9576_00310   putative 3-ketoacyl-(acyl-carrier-protein) reductase
Locus 4   HMPREF9576_00311   ornithine cyclodeaminase/mu-crystallin family protein
Locus 4   HMPREF9576_00312   pyridoxal-phosphate dependent enzyme
Locus 4   HMPREF9576_00313   lantibiotic dehydratase, C-terminus
Locus 4   HMPREF9576_00314   aminotransferase, class I/II
Locus 4   HMPREF9576_00315   acyl carrier domain protein
Locus 4   HMPREF9576_00316   AMP-binding enzyme
Locus 4   HMPREF9576_00317   malonyl CoA-acyl carrier protein transacylase family protein
Locus 4   HMPREF9576_00318   ABC-2 type transporter
Locus 4   HMPREF9576_00319   ABC transporter, ATP-binding protein
Locus 4   HMPREF9576_00320   hypothetical protein
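The core/non-core definition used in this Example is, in effect, a set intersection across genomes: a region is core if every genome contains it and non-core otherwise. The sketch below illustrates the definition with set operations. The strain names and region labels are hypothetical toy data, and a real analysis would derive region presence from genome alignments rather than from pre-labelled sets.

```python
def core_and_noncore(genomes: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Split regions into core (present in every genome) and non-core (the rest)."""
    all_regions = set().union(*genomes.values())
    core = set.intersection(*genomes.values())
    return core, all_regions - core

# HYPOTHETICAL toy input: region labels observed in each genome.
toy_genomes = {
    "strain_A": {"region_1", "region_2", "locus_1"},
    "strain_B": {"region_1", "region_2"},
    "strain_C": {"region_1", "region_2", "locus_4"},
}

core_regions, noncore_regions = core_and_noncore(toy_genomes)
print("core:", sorted(core_regions))
print("non-core:", sorted(noncore_regions))

# Strains carrying a given non-core region, e.g. to look for ribotype-specific loci.
carriers = {
    region: sorted(name for name, regions in toy_genomes.items() if region in regions)
    for region in noncore_regions
}
print(carriers)
```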

Example 8—Targeted Phage Therapy

Bacteriophages play an important role in regulating the composition and dynamics of microbial communities, including the human skin microbiota. Bacteriophages of Propionibacterium acnes, a major skin commensal, were previously isolated and used as a typing system to distinguish different serotypes of P. acnes. However, molecular characterization of these phages had been lacking. Recent efforts in genome sequencing have improved our understanding of P. acnes phages and their interactions with bacterial hosts.

Bacteriophages are the most abundant organisms on earth (Mc Grath & van Sinderen, 2007) and are believed to outnumber bacteria by 10:1 in many diverse ecosystems (Rohwer, 2003). As important components of microbial communities, bacteriophages are a reservoir of diversity-generating elements (Rohwer & Thurber, 2009) and regulate both the abundances (Suttle, Chan, & Cottrell, 1990) and diversity of microbial hosts by predation (Rodriguez-Valera et al., 2009). The human skin is inhabited by hundreds of microbial species, including bacteria, fungi, and viruses (Grice & Segre, 2011). The homeostasis of this ecosystem is important to its function as a barrier against the invasion and colonization of pathogens on the skin. However, much remains to be learned about the nature and driving forces of the dynamics among the microorganisms in the skin microbial community. In particular, the relative abundances and interactions between bacteriophages and their bacterial hosts on the skin remain to be elucidated.

The microbial community in the pilosebaceous unit of the skin is dominated by Propionibacterium acnes, which accounts for approximately 90% of the microbiota (“(Nature Precedings Paper),” n.d.). P. acnes has been suggested as a pathogenic factor in the development of acne vulgaris (Bojar & Holland, 2004; Leyden, 2001), one of the most common human skin diseases. Above-detailed studies classified P. acnes strains into ribotypes (RT) based on their 16S ribosomal RNA (rRNA) sequences, and demonstrated that P. acnes strain population structure in pilosebaceous units differs between healthy skin and acne affected skin.

P. acnes bacteriophages exist on the human skin. In 1968, Zierdt et al. (Zierdt, Webster, & Rude, 1968) isolated such a phage, named phage 174, from spontaneous plaques of a P. acnes isolate (at the time known as Corynebacterium acnes). Phage 174 was able to lyse nearly all P. acnes strains tested in the study [10]. Subsequently, more P. acnes phages were isolated which exhibited varied life cycles that range from lytic to temperate [11, 12]. However, in the last decades, the study of P. acnes bacteriophages had been limited to the development of phage typing systems to distinguish the different serotypes of P. acnes [13, 14], and extensive molecular characterization of the phages has been lacking.

Recent genomic sequencing of P. acnes bacteriophages (Farrar et al., 2007; Lood & Collin, 2011; Marinelli et al., 2012) has provided new insight into P. acnes phage diversity. P. acnes phages are similar to mycobacteriophages both morphologically and genetically, but have a much smaller genome. Currently 14 phage genome sequences are available. Sequencing additional phage isolates is needed to further characterize the diversity. Despite these recent sequencing efforts, the genome-level diversity of P. acnes phages in the human skin microbiome and their interactions with P. acnes and other Propionibacteria remain to be elucidated. P. acnes phages have diverse host specificities among different lineages of P. acnes strains [14]. Phage host specificity is important in determining how these phages regulate the composition and dynamics of P. acnes populations in the community. On the other hand, certain P. acnes strains may also influence phage populations through their anti-viral mechanisms, such as the bacterial immune system based on the transcription of clustered, regularly interspaced, short palindromic repeat (CRISPR) sequence arrays. The CRISPR arrays contain oligonucleotide ‘spacers’ derived from phage DNA or plasmid DNA. In a manner analogous to RNA interference, the transcribed, single-stranded CRISPR RNA elements interact with CRISPR-associated (Cas) proteins to direct the degradation of DNA targets containing complementary ‘protospacer’ sequences from foreign DNA [16]. While characterizing the genome diversity of P. acnes, Applicants discovered that P. acnes strains of RT2 and RT6 harbor CRISPR arrays. The CRISPR mechanism may play a role in defending against phage or plasmid invasion.
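The spacer/protospacer relationship described above can be illustrated as a search for each CRISPR spacer, on either strand, within a phage genome. The sketch below uses exact matching on short made-up sequences. It is a simplification for illustration only: the sequences and function names are hypothetical, and a realistic analysis would tolerate mismatches and typically rely on an alignment tool.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def find_protospacers(spacer: str, target: str) -> list[tuple[str, int]]:
    """Return (strand, 0-based start) for every exact spacer match in target."""
    target = target.upper()
    hits = []
    for strand, query in (("+", spacer.upper()), ("-", reverse_complement(spacer))):
        start = target.find(query)
        while start != -1:
            hits.append((strand, start))
            start = target.find(query, start + 1)
    return hits

# HYPOTHETICAL toy sequences, not taken from any P. acnes strain or phage.
toy_spacer = "ACGTCAG"
toy_phage_genome = "TTTACGTCAGGGCTGACGTAA"

hits = find_protospacers(toy_spacer, toy_phage_genome)
print(hits)
```

A strain whose spacers match a phage genome would be predicted to resist that phage, which is the kind of host-range hypothesis the screening described in this Example can test.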

To better understand the interactions between bacteria and bacteriophages in the human skin microbiome and their contributions to skin health and disease, the diversity and host specificity of P. acnes phages isolated from acne patients and healthy individuals were investigated. The genomes of 15 phage isolates were investigated and screened against a panel of 69 sequenced Propionibacteria strains to determine their host range and specificity.

Phage Isolation and General Genome Features

To characterize the genetic diversity and the abundance of P. acnes phages in the skin microbiome, 203 skin samples of pilosebaceous units from 179 individuals were collected, including 109 samples from normal individuals and 94 from acne patients. All of the samples were cultured for P. acnes under anaerobic conditions. Phage plaques in 49 samples were observed: 35 from normal individuals and 14 from acne patients. P. acnes phages were found more frequently in samples from normal individuals than from acne patients with statistical significance (p=0.005, Fisher's exact test). Among the 93 phage isolates that were obtained from these samples, five phages from acne patients and ten from normal individuals were selected for whole genome sequencing using 454 or Illumina platforms (Table 3-1).
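The comparison of plaque frequencies can be reproduced with a two-sided Fisher's exact test on a 2×2 table. The sketch below implements the test directly from the hypergeometric distribution using only the standard library. It assumes the test was run on per-sample counts (35 of 109 normal samples and 14 of 94 acne samples with plaques), which the text implies but does not state outright; the function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def table_probability(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)  # small tolerance for float comparison
    )

# Rows: normal samples, acne samples. Columns: plaques observed, no plaques.
p_value = fisher_exact_two_sided(35, 109 - 35, 14, 94 - 14)
print(f"two-sided p = {p_value:.4f}")
```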

TABLE 3-1
Phage Genome Information and Sequencing Statistics

             Genome                Total     Total                              Presence
             Length                Input     Input                  Annotated   of 11-nt
Phage Name   (bp)      GC %        Reads     Bases        Coverage  ORFs        Overhang
PHL111M01    29,140    54.33       5,453     2,865,116    98x       46          yes
PHL060L00    29,514    54.05       10,000    12,015,904   407x      47          yes
PHL112N00    29,266    54.48       10,000    12,270,460   419x      47          yes
PHL113M01    29,200    54.10       4,228     2,237,790    77x       45          yes
PHL114L00    29,464    54.21       10,000    12,270,609   416x      47          yes
PHL010M04    29,511    53.99       3,185     1,686,535    57x       46          not verified
PHL066M04    29,512    53.99       4,478     2,364,379    80x       46          yes
PHL073M02    29,503    53.99       4,700     2,529,669    86x       46          not verified
PHL071N05    29,467    53.92       6,143     3,059,098    104x      46          not verified
PHL067M10    29,377    54.26       4,486     2,313,471    79x       46          yes
PHL115M02    29,453    53.82       8,914     4,687,391    159x      46          yes
PHL085M01    29,451    53.82       10,000    12,552,610   426x      46          yes
PHL037M02    29,443    53.78       5,895     3,093,818    105x      46          yes
PHL085N00*   29,454    53.83       20,000    3,904,239    133x      45          yes
PHL082M00*   29,491    54.38       20,000    3,289,741    112x      44          yes
Average      29,383    54.05       6,729     5,688,219    193.37    46

*Sequenced on the Illumina MiSeq platform. All other genomes were sequenced on the 454 platform.

All phage genomes were assembled, completed, and annotated (FIG. 29). The genomes of these 15 phages have comparable sizes (29.1-29.5 Kb) and GC content (53.8-54.5%), similar to the published P. acnes phage genomes. On average, 46 open reading frames (ORFs) were predicted in each genome. Consistent with the genome organization previously reported [11, 15], the ORFs were arranged compactly within the left and right arm regions of each genome. The left arm and right arm of the genomes can be distinguished by their opposite directions of transcription. The sequence identity between any pair of genomes is moderately high, ranging from 78.2 to 99.9% (Table 3-1).
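The coverage values reported for the sequenced phages follow from dividing the total input bases by the genome length. The sketch below recomputes coverage for three of the phages using the figures in Table 3-1; the dictionary layout and variable names are hypothetical.

```python
# (genome length in bp, total input bases) for three phages from Table 3-1.
sequencing_stats = {
    "PHL111M01": (29_140, 2_865_116),
    "PHL060L00": (29_514, 12_015_904),
    "PHL085N00": (29_454, 3_904_239),
}

# Average depth of coverage = total sequenced bases / genome length.
coverage = {
    phage: total_bases / genome_length
    for phage, (genome_length, total_bases) in sequencing_stats.items()
}

for phage, depth in coverage.items():
    print(f"{phage}: {depth:.1f}x")
```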

P. acnes Phages are Diverse with Subgroups of Highly Related Strains with Distinct Sites of Genetic Variations

To investigate the genome diversity of P. acnes phages, all 29 sequenced phage genomes were compared, including Applicants' 15 phage genomes and the 14 published ones (Farrar et al., 2007; Lood & Collin, 2011; Marinelli et al., 2012). The core genomic regions shared by all 29 genomes have a combined length of 24,475 bp (83% of the average genome length) and contain 6,812 single-nucleotide polymorphisms (SNPs). A phylogenetic tree constructed from these 6,812 SNPs (FIG. 30) shows that most of the phage genomes isolated from all studies to date show comparable divergence from each other, with an average distance of 0.301 (substitution rate at the SNP sites). However, two groups of phages, named Group I and Group II (FIG. 30), were also found to be closely related, with much shorter phylogenetic distances. The same results were obtained when the entire genome sequences (including core and non-core regions) were used in the calculation of phylogenetic relationships (FIG. 31).
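A distance expressed as a substitution rate at the SNP sites can be computed by first finding the variable columns of a core-genome alignment and then counting, for each pair of genomes, the fraction of those columns at which they differ. The sketch below shows this on a toy alignment. The sequences and names are hypothetical, and the tree-building software actually used may apply a substitution model rather than this raw proportion.

```python
from itertools import combinations

def snp_columns(alignment: dict[str, str]) -> list[int]:
    """Indices of alignment columns where at least two genomes differ."""
    sequences = list(alignment.values())
    length = len(sequences[0])
    return [i for i in range(length) if len({seq[i] for seq in sequences}) > 1]

def snp_distance(seq_a: str, seq_b: str, columns: list[int]) -> float:
    """Proportion of SNP columns at which two aligned sequences differ."""
    return sum(seq_a[i] != seq_b[i] for i in columns) / len(columns)

# HYPOTHETICAL toy alignment of a core region (equal-length, gap-free).
toy_alignment = {
    "phage_A": "ACGTACGT",
    "phage_B": "ACGTACGA",
    "phage_C": "ACCTACGA",
}

columns = snp_columns(toy_alignment)
distances = {
    (a, b): snp_distance(toy_alignment[a], toy_alignment[b], columns)
    for a, b in combinations(toy_alignment, 2)
}
print(columns, distances)

# Core fraction quoted in the text: 24,475 bp core / 29,383 bp average genome.
core_fraction = 24_475 / 29_383
print(f"core fraction: {core_fraction:.0%}")
```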

It was next determined whether the newly-sequenced phages belong to the phylogenetic groups discovered before. Lood et al. previously surveyed the phylogenetic diversity of P. acnes phages based on the nucleotide sequences encoding head proteins or amidases of phage isolates [12]. Three major phylogenetic groups were reported. Applicants' data were combined with the data from Lood et al., and Applicants reconstructed the phylogenetic trees of head protein and amidase gene sequences. The updated phylogenetic trees reproduced the relationships among the strains from the previous study (FIG. 32). However, Applicants' phages were grouped into separate clades. Moreover, by including the gene sequences from the current study, the longest phylogenetic distance among all studied phages was increased from 0.077 to 0.102 for the head protein gene and from 0.140 to 0.182 for the amidase gene. Although these distances are still considerably shorter than those of the closest outgroups (0.939, head protein from mycobacteriophage Che9d; 0.764, P. acnes KPA171202 amidase) [12], the analysis suggests that P. acnes phage diversity is broader than previously described.

Some of the P. acnes phages appear to be closely related strains, as previously shown [12]. Among the 29 sequenced genomes, two groups of closely related strains were observed (FIG. 30). Group I consists of PHL066M04, PHL010M04, and PHL073M02, which are separated by an average phylogenetic distance of 0.002 at the genome level. Group II consists of PHL085M01, PHL085N00, PHL115M02, and PHL037M02, with an average phylogenetic distance of 0.004. These two groups are statistically robust whether the core regions, the entire genome, or only the left-arm or right-arm coding regions were used in calculating the phylogeny (FIGS. 30 and 31).

Whether the genetic variations among Group I phages or Group II phages were located in particular regions of the genomes was investigated. The sites of sequence variation among Group I phages lie primarily within the region encoding a putative type II holin and a peptidoglycan amidase (Gp20 and Gp21 as annotated in PA6, FIG. 33). These endolysins permeabilize the membrane and degrade the extracellular peptidoglycan layer to release new phage particles from the bacterial host. The majority of the sequence variations in these two genes among the Group I phage genomes are synonymous and do not appear to affect the functions of the proteins. The genomes of PHL010M04 and PHL066M04 differ at only 11 sites, 9 of which occur in predicted coding regions.

Genetic variations among Group II members reside in a region encoding homologs of Gp16, Gp17, and Gp18 in PA6 (FIG. 33). These proteins' location near the 3′ end of the left arm, between structural proteins and lysis proteins, suggests that they may be late-acting genes involved in viral protein processing and packaging.

Alternative Annotations of P. acnes Phage Genomes

The large number of newly-sequenced P. acnes phage strains provided an opportunity to validate and refine initial annotations of the P. acnes phage genomes. Based on the analysis, several alternative annotations of the phage genomes were confirmed.

All 15 phage genomes that were sequenced support an alternative annotation of the Gp22/23 locus, which was previously annotated as two ORFs, Gp22 and Gp23, encoded on the plus-strand in the PA6 genome. Homologs of the PA6 genes Gp22 and Gp23 were not consistently identified in the genomes, as many of the homologs have inconsistent start and stop codon positions at the expected plus-strand locations of these genes. However, on the minus-strand, all genomes appear to encode a single ORF with a length of 513-522 bp. This annotation, which is referred to herein as Gp22/23 (FIG. 29), is consistent with the annotation reported by Marinelli et al. (2012). While no known function was assigned to Gp22 or Gp23 of the original plus-strand annotation, the minus-strand annotation of Gp22/23 in PHL112N00 and PHL111M01 showed modest similarity to a zinc finger protein from Arthroderma gypseum (E-values 1.0e-4 and 5.7e-4, respectively). The PHL111M01 annotation also showed similarity to a polyprenyl diphosphate synthase from Streptomyces albus (E-value 2.6e-5) and a polyketide synthase from the Frankia genus (E-value 2.5e-4). The minus-strand Gp22/23 ORFs from most genomes are homologs of each other, except those in PHL067M10 and PHL114L00. The Gp22/23 ORFs in these two genomes form a separate group and share little nucleotide similarity to the other Gp22/23 ORFs despite being present at the same locus in the genome. The observation that this ORF appears on the minus strand in all the phage genomes suggests that this region may be part of the right arm. This is consistent with previous reports of a plus-strand transcriptional terminator that separates Gp22/23 from the rest of the left arm in PA6, PAD20, and PAS50 [11].

Homologs of the PA6 ORFs Gp42, Gp45, and Gp46, which occur in the right arm of the genome near the ~1 kb non-coding region, were not consistently identified. The expected locations of each of these right-arm ORFs in the phage genomes frequently contained numerous stop codons and showed limited homology to corresponding regions of the PA6 genome. This is consistent with the generally high degree of nucleotide variation near the non-coding region and suggests that these ORFs may represent genes that are differentially present among different phage strains.

The sequencing data demonstrated that the ends of the phage genomes are flanked by 11-nucleotide single-stranded overhangs (Table 3-1). In the sequencing data of 10 phage genomes, 1-3 reads that span both the 3′ and 5′ ends of the genomes were found. The genome ends in these reads are consistently separated by a sequence that matches the 11-nt single-stranded extension previously reported (Marinelli et al., 2012). However, the sequencing data did not show the presence of overhangs in three of the 15 genomes (PHL010M04, PHL073M02, and PHL071N05). It is possible that they were simply not detected, as overhang-containing reads were rarely observed in general (2.3 overhang reads per 10,000 reads). Nevertheless, the data do suggest that the phage DNA could be circularized at some point in the phage life cycle, as previously proposed [11]. The absence of the overhang sequence in reads that map to only one end of the genome may be an artifact of sample processing, as T4 DNA polymerase is used to ‘polish’ fragmented library DNA by digesting 3′ single-strand extensions and extending the complement of 5′ single-strand extensions (Roche Diagnostics, 2009). If so, it may be surmised that the overhang exists on the 3′ ends of the genome.

Host Range and Specificity of P. acnes Phages

To investigate the host range and specificity of P. acnes phages, the 15 sequenced phages were screened against a panel of 69 Propionibacterium strains, including 65 P. acnes strains, three P. humerusii strains, and one P. granulosum strain. Except for the P. acnes strains KPA171202 and ATCC 11828, all of these Propionibacterium strains were isolated from the same cohort of subjects sampled for phages. The genomes of all 65 P. acnes strains and three P. humerusii strains were sequenced. A phylogenetic tree of the 65 P. acnes strains based on the SNPs in their core genomic regions was constructed (FIG. 33, left dendrogram). Based on the previously established typing of P. acnes strains by their RecA gene sequences [7], the bacterial collection included all major lineages of P. acnes found on the human skin, with multiple strains representing each type: IA-1, IA-2, IB-1, IB-2, IB-3 and II. The susceptibility/resistance of each of the 69 bacterial strains against each of the 15 sequenced phages was determined using a cross-streak method. In total, 1,035 bacterium-phage interactions (69 strains × 15 phages) were determined. Each experiment was repeated at least five times. For the bacterial strains that showed resistance to phages, the fold change in efficiency of plaquing (EOP) was determined relative to the P. acnes strain ATCC 6919, which is known to be susceptible to all tested phages.

It was found that the susceptibility/resistance to phage is correlated with the P. acnes lineages. Five of the 69 Propionibacterium strains showed a 100-fold or greater increase in resistance against at least one phage. P. acnes strains of types IA-1, IA-2, IB-1, and IB-2 were all susceptible to all tested phages. However, two strains of type IB-3 (KPA171202 and HL030PA1) were highly resistant to some of the phages (FIG. 34). Type IB-3 strains encode components of a type III restriction modification system (genes PPA1611 and PPA1612 in KPA171202). This may explain their resistance to phages. KPA171202 encodes a cryptic prophage in the genome [17]. However, the sequence of the prophage is not related to any of the sequenced P. acnes phages; therefore, the presence of the cryptic prophage is unlikely to explain the resistance to phages. Three type II strains were also highly resistant to some of the phages. This is consistent with previous observations that strains of this type were more frequently resistant to phages [14].

On the other hand, the susceptibility/resistance of P. acnes strains to phages did not correlate with phage lineages (r=0.1343, p-value=0.115, Mantel test). Even the host ranges among closely-related phage strains in Group I or Group II are different (FIG. 34). One example is PHL066M04, a Group I phage that showed little similarity to other phages in the same group, but had a bacterium-phage interaction pattern similar to that of the Group II phages PHL115M02 and PHL037M02. These results suggest that bacterial factors may play an important role in determining the phage host range and specificity.

To determine whether these phages are specific to only P. acnes or are capable of interacting with other Propionibacteria, one P. granulosum strain and three P. humerusii strains were included in the bacterium-phage interaction experiment. P. granulosum is a common skin commensal with approximately 1.1% abundance in the pilosebaceous unit [7]. P. humerusii is a newly-defined species [18]. In the study cohort, P. humerusii is one of the major species found on the skin, with an abundance of 1.9% in the pilosebaceous unit [7]. It is closely related to P. acnes, with >98% identity in the 16S rRNA gene sequence [18]. While the P. granulosum strain showed strong resistance to all the phages tested, two P. humerusii strains, HL037PA2 and HL037PA3, were susceptible to all the phages. The third P. humerusii strain, HL044PA1, was lysed by ten of the 15 phages tested. This suggests that the host range of P. acnes phages is not limited to P. acnes but also includes P. humerusii and possibly other closely-related Propionibacterium species.

Resistance to Bacteriophages Does not Correlate with the Presence of Matching CRISPR Spacers in P. acnes Strains

Among the 65 P. acnes isolates, eight strains belong to RT2 and RT6 (RecA type II) and encode CRISPR/Cas genes, which function as a bacterial adaptive immune mechanism against foreign DNA. These RT2 and RT6 strains each have one to nine spacers, each 33 nucleotides long, in their CRISPR arrays. In total, they encode 42 spacers, 28 of which are unique.

Whether the CRISPR/Cas mechanism can explain phage susceptibility/resistance in the RT2 and RT6 strains was investigated. Protospacers in the 15 phage genomes that match a spacer sequence from the RT2 and RT6 P. acnes strains were identified. Up to two mismatches were allowed in the sequence alignments. In all phages, protospacers that match the spacers in at least two P. acnes strains were identified (FIG. 35). These protospacers are all present as a single copy in the phage genomes and are located primarily on the left arm (FIG. 36). Their locations are generally conserved among all other phage genomes harboring the same protospacer sequences.

The susceptibility/resistance patterns of the eight RT2 and RT6 P. acnes strains showed little correlation with either the number of spacers in each array that had protospacer matches (r=0.207) or whether at least one match could be found against the CRISPR array in general (r=0.202). Susceptibility/resistance to phages also did not correlate with the pattern with which any specific spacer matched (maximum absolute correlation 0.051).

Phages can escape the CRISPR defense mechanism by mutating sites involved in protospacer recognition. The short nucleotide motif downstream of the protospacer, known as the protospacer-adjacent motif (PAM), is highly conserved among targets of CRISPR/Cas systems [19]. Mutations in these nucleotides have been found to disrupt CRISPR-mediated resistance despite complete complementarity in the protospacer sequence [20-22]. To determine whether the lack of correlation between bacterial susceptibility/resistance and the presence of matching spacer sequences is due to mutations within the PAM sequence, the PAMs of the nine protospacers that have exact matches to the spacer sequences encoded in HL042PA3 were examined. Six of these protospacers come from phages to which HL042PA3 was resistant, while the other three protospacers are from phages that were able to lyse HL042PA3. Among the six protospacers, sequence conservation was observed at several sites within their 33-nucleotide length and within the ten downstream nucleotides expected to contain the PAMs (FIG. 37). This suggests that these protospacer motifs are conserved and can be targeted by HL042PA3 CRISPRs. However, these same nucleotide positions are also conserved in the three protospacers from other phages (PHL113M01, PHL112N00, and PHL085M01) that were able to lyse HL042PA3 (FIG. 37). Thus, the conservation of protospacer motifs including the PAMs cannot explain the lack of correlation between bacterial susceptibility/resistance and the presence of matching spacer sequences.

In summary, the data demonstrate that encoding CRISPR spacers that match the genome of an invading phage is not sufficient for an effective defense, suggesting that transcriptional and/or translational regulation of CRISPR RNA and Cas gene expression may also be required for CRISPR-mediated resistance. Interactions between these bacteria and phages may also depend on additional phage and bacterial components involved in phage binding, entry, replication, or release.

A diverse group of P. acnes bacteriophages that reside on the human skin has been revealed. Most of the sequenced phages show moderately high genetic similarity, with certain strains forming closely-related groups. These phages show various patterns of interaction with P. acnes and P. humerusii strains, but these patterns do not correlate with phage phylogeny. It was determined that resistance or susceptibility to phages correlated well with P. acnes lineages. Types IA-1, IA-2, IB-1, and IB-2 were all susceptible to all tested phages, while certain strains of types IB-3 and II were resistant to some phages. Phage resistance in type II P. acnes strains does not correlate with the presence of CRISPR spacers that match phage protospacers, suggesting that additional mechanisms, such as regulation of the CRISPR/Cas system and/or other antiviral mechanisms, are needed to confer phage resistance.

This study suggests an important regulatory role of P. acnes bacteriophages in the skin microbiome. The strain-specific host ranges demonstrate the ability of these phages to regulate particular subsets of the P. acnes population and P. humerusii population. Among these subsets of Propionibacterium populations, phages may also disseminate genes that potentially modify virulence, as suggested by Lood and Collin [11], or competitiveness, as it was suggested that Gp22/23 encoded in some phages may be involved in the production of polyketide antimicrobials. Both the selective lysis and the modification of P. acnes strains by phages potentially regulate the relative abundances of the commensal and pathogenic strains of P. acnes on the skin. This delicate balance between commensals and pathogens can be especially important for skin health and disease at sites where P. acnes dominates. Based on the metagenomic shotgun sequencing data, it is estimated that the ratio between P. acnes phage and P. acnes in the pilosebaceous unit is 1:20 [7], which is far different from the phage:bacteria ratios estimated in environmental microbial communities, where viruses typically outnumber bacteria [23]. This suggests that the human host also plays a role in selecting and regulating the composition and diversity of the microbiome.

Materials and Methods

Propionibacteria Culture

P. acnes, P. humerusii, and P. granulosum strains were cultured under anaerobic conditions in Clostridial media (Oxoid) at 37° C. for 4-6 days. Propionibacterium cultures were used to prepare top agar overlays for phage culture on A media plates (12 g/L pancreatic digest of casein, 12 g/L Difco yeast extract, 22.2 mM D-glucose, 29.4 mM potassium phosphate monobasic, 8 mM magnesium sulfate heptahydrate, 20 g/L Difco agar).

Phage Isolation and DNA Extraction

Plaques found on skin sample culture plates were isolated by puncturing the agar with a pipet tip and resuspending in 50 μL SM buffer (0.1 M sodium chloride, 8 mM magnesium sulfate heptahydrate, 1 M Tris-HCl, pH 7.5, 2% gelatin, 1 mM calcium chloride). The phage resuspension was spread onto A media plates with top agar containing P. acnes strain ATCC 6919. After incubation at 37° C. for 2 days, phages were eluted with 8 mL SM buffer at room temperature, filtered with a 0.22 μm PES filter (Millipore), and stored at 4° C. Phage titers were determined by plaque assay.

Phage DNA extraction was performed using the Lambda Mini Kit (Qiagen) with the following modifications. Phage particles were precipitated in Buffer L2 by centrifugation at 20,000 × g for 1 hour. Extracted DNA was eluted with Buffer QF and precipitated with isopropanol overnight at −20° C. before centrifugation.

Phage Genome Sequencing and Annotation

Phage genomes were sequenced in multiplex using the Roche GS FLX Titanium or Illumina MiSeq platforms. De novo assembly of reads was performed with MIRA [24], and the resulting contigs were manually finished in Consed [25]. For phages covered by more than 20,000 reads, assembly was performed on a randomly-selected subset of 10,000 reads for 454 data or 20,000 reads for MiSeq data. Fully assembled phage genomes were annotated using GeneMark.hmm [26] and Glimmer v3.02 [27].
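The read-subsetting rule described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `select_reads_for_assembly`, the dictionary keys, and the fixed random seed are hypothetical, and the tool actually used to draw the random subset is not stated in the text.

```python
import random

# Subset sizes stated in the text; the dictionary keys are illustrative labels
SUBSET_SIZE = {"454": 10_000, "MiSeq": 20_000}
READ_THRESHOLD = 20_000  # subsetting applies above this number of reads


def select_reads_for_assembly(reads, platform, seed=0):
    """Return all reads, or a random subset when the phage is deeply covered.

    Reads are drawn uniformly at random without replacement. The seed is an
    assumption added here so that the sketch is reproducible.
    """
    reads = list(reads)
    if len(reads) <= READ_THRESHOLD:
        return reads
    rng = random.Random(seed)
    return rng.sample(reads, SUBSET_SIZE[platform])


# Hypothetical read identifiers standing in for real sequencing reads
deep_run = [f"read_{i}" for i in range(25_000)]
shallow_run = [f"read_{i}" for i in range(5_000)]
print(len(select_reads_for_assembly(deep_run, "454")))
print(len(select_reads_for_assembly(shallow_run, "454")))
```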

Genome Sequence Alignment and Phylogenetic Tree Construction

Sequences present in all 16 phage genomes were defined as core regions of the phage genome. To identify these core regions, alignments were first generated between the PA6 genome and each of the other 15 phage genomes using Nucmer [28]. This yielded 15 sets of starting and ending coordinates describing intervals within the PA6 genome that align with any given phage genome. The core regions were then calculated for all phages by determining the overlapping intervals between all of the 15 coordinate sets. The core region sequences were concatenated for the subsequent multiple sequence alignments. Single nucleotide polymorphisms (SNPs) in the core regions were identified by using the “show-snps” option of Nucmer with the default setting. Using MEGA5 [29], phylogenetic trees were constructed by the Neighbor Joining method on p-distances based on SNP sites. Bootstrapping was based on 200 replicates.
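Determining the overlapping intervals between coordinate sets amounts to an interval intersection on the reference (PA6) coordinates. The following is a minimal sketch of that step, assuming inclusive (start, end) coordinates; the function names and the three toy coordinate sets are hypothetical and do not reproduce the script used in the study.

```python
from functools import reduce


def merge_intervals(intervals):
    """Sort intervals and merge any that overlap or are directly adjacent."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect_two(a, b):
    """Intersect two sorted lists of non-overlapping inclusive intervals."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start <= end:
            result.append((start, end))
        # Advance whichever interval ends first
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


def core_regions(coordinate_sets):
    """Regions of the reference covered by every coordinate set."""
    if not coordinate_sets:
        return []
    return reduce(intersect_two, (merge_intervals(s) for s in coordinate_sets))


# Hypothetical aligned intervals on the reference for three phages
toy_sets = [
    [(1, 1000), (1200, 2000)],
    [(50, 1500)],
    [(1, 900), (1300, 2500)],
]
core = core_regions(toy_sets)
core_length = sum(end - start + 1 for start, end in core)
print(core, core_length)
```

For the toy input, the shared regions are (50, 900) and (1300, 1500), for a combined core length of 1,052 positions.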

Multiple sequence alignments of full-length phage genomes, left and right arm coding regions, head protein sequences, and amidase sequences were each generated with MAFFT [30] or MUSCLE [31]. Phylogenetic trees were constructed in SeaView [32] based on the BioNJ method applied to the Jukes-Cantor distances between the sequences. All trees were bootstrapped for 5,000 replicates.
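For reference, the Jukes-Cantor distance corrects the observed proportion of differing sites p for multiple substitutions at the same site, using d = −(3/4) ln(1 − 4p/3). The sketch below shows only this standard formula; it is not SeaView's code, and the function name is hypothetical.

```python
import math


def jukes_cantor_distance(p: float) -> float:
    """Jukes-Cantor corrected distance for an observed difference proportion p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75 for the correction")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


# The correction is small for similar sequences and grows as p increases
for p in (0.01, 0.10, 0.30):
    print(f"p = {p:.2f}  ->  d = {jukes_cantor_distance(p):.4f}")
```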

Determination of Variation Sites

Multiple sequence alignments of Group I and Group II phages were generated using MAFFT [30]. In each of these alignments, the positions of all mismatches and gaps (discrepancies) were recorded relative to a reference genome that was chosen at random. Contiguous gaps in the reference genome were counted as a single discrepancy. The reference genome was divided into 50-nucleotide windows, and the discrepancy density of each sequence window was calculated as the total number of discrepancies it contained. Densities were plotted in Artemis [33].
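The windowed discrepancy count can be sketched as below. Several details are assumptions made for illustration: gaps are written as "-", a column counts once if any non-reference sequence differs from the reference, and a run of reference gaps is assigned to the window of the preceding reference base. The function name and the toy alignment are hypothetical.

```python
def discrepancy_density(reference, others, window=50):
    """Count discrepancies per reference window in an aligned set of sequences.

    `reference` and every sequence in `others` are rows of one alignment and
    therefore have equal length, with '-' marking gaps.
    """
    ref_length = sum(1 for base in reference if base != "-")
    if ref_length == 0:
        raise ValueError("reference contains no bases")
    counts = [0] * ((ref_length + window - 1) // window)
    ref_pos = -1          # index of the most recent reference base
    in_ref_gap = False    # True while inside an already-counted reference gap
    for col, ref_base in enumerate(reference):
        differs = any(seq[col] != ref_base for seq in others)
        if ref_base == "-":
            # A contiguous run of reference gaps counts as one discrepancy
            if differs and not in_ref_gap:
                counts[max(ref_pos, 0) // window] += 1
                in_ref_gap = True
            continue
        ref_pos += 1
        in_ref_gap = False
        if differs:
            counts[ref_pos // window] += 1
    return counts


# Toy alignment with a 5-nt window so the result is easy to check by hand
toy_reference = "ACGTA--CGTAC"
toy_other = "ACCTAGGCGT-C"
print(discrepancy_density(toy_reference, [toy_other], window=5))
```

In the toy alignment, the first window holds one mismatch plus one two-column reference gap counted once, and the second window holds one gap in the other sequence, giving counts of 2 and 1.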

To determine the single-nucleotide variations within each strain, all read data for each phage, including reads not initially included in the de novo genome assembly, were mapped to their corresponding genomes using MIRA. as the sites in each phage genome assembly.

Bacterial Resistance Test

The susceptibilities/resistances of Propionibacterium strains against 15 phages were determined using a modified cross-streak assay. The bacterial strains were cultured and streaked in parallel across A media plates (5-6 isolates on each plate, ~1 cm apart, along with ATCC 6919 as a control). Approximately 5 μL of 10⁶ pfu/mL phage suspension was applied onto each streak, and then the plates were incubated at 37° C. anaerobically for 2 days. At least five replicates of each cross-streak experiment were performed to determine, based on lysis, whether the strains were susceptible or resistant. The resistance of the bacterial strains was further quantified by assaying the efficiency of plaquing of the phages relative to P. acnes strain ATCC 6919, calculated as follows:

$\text{Resistance} = \frac{1}{\text{Efficiency of Plaquing}} = \frac{\text{Titer of Phage Strain } X \text{ on ATCC 6919}}{\text{Titer of Phage Strain } X \text{ on Bacterial Strain } Y}$

A 100-fold or greater decrease in efficiency of plaquing, that is, a resistance value of 100 or greater, was considered to be evidence of resistance.
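The resistance calculation and the 100-fold criterion can be illustrated with a short sketch. The titers below are hypothetical, the function names are illustrative, and treating a zero titer on the test strain as infinite resistance is an assumption added here to avoid division by zero.

```python
import math

RESISTANCE_THRESHOLD = 100.0  # 100-fold criterion stated in the text


def resistance(titer_on_atcc6919: float, titer_on_strain_y: float) -> float:
    """Reciprocal of efficiency of plaquing: reference titer over test titer."""
    if titer_on_atcc6919 <= 0:
        raise ValueError("reference titer must be positive")
    if titer_on_strain_y < 0:
        raise ValueError("titer cannot be negative")
    if titer_on_strain_y == 0:
        return math.inf  # assumption: no plaques means unbounded resistance
    return titer_on_atcc6919 / titer_on_strain_y


def is_resistant(titer_on_atcc6919: float, titer_on_strain_y: float) -> bool:
    """True when resistance meets or exceeds the 100-fold criterion."""
    return resistance(titer_on_atcc6919, titer_on_strain_y) >= RESISTANCE_THRESHOLD


# Hypothetical titers in pfu/mL for one phage on two test strains
print(resistance(1e8, 1e4), is_resistant(1e8, 1e4))
print(resistance(1e8, 5e7), is_resistant(1e8, 5e7))
```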

Phage Interaction Correlation

To determine whether genetically similar phages have similar host range and specificity, the correlation between their phylogenetic and phenotypic relationships was calculated, the latter based on results from the bacterial resistance test. Each column in the bacterial resistance table, which represents the host range of a given phage, was converted to binary form by assigning 1 to instances of resistance and 0 to instances of susceptibility. The Euclidean distance between each pair of columns was used to calculate a phenotype distance matrix between all phages. A phylogenetic distance matrix among the phage genomes was calculated using MEGA5 [29]. Using the ade4 package [34] in R, a Mantel test was performed on the phenotype and phylogenetic distance matrices to determine the correlation between the two. A total of 10,000 permutations were performed.
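The procedure above can be sketched from scratch in a few lines. This is not the ade4 implementation used in the study: it is a simplified one-sided permutation Mantel test written for illustration, and all function names, the toy binary host-range columns, and the toy phylogenetic distances are hypothetical.

```python
import math
import random


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(columns):
    """Pairwise Euclidean distances between binary host-range columns."""
    n = len(columns)
    return [[euclidean(columns[i], columns[j]) for j in range(n)] for i in range(n)]


def upper_triangle(matrix, order=None):
    """Flatten the upper triangle, optionally after relabelling rows/columns."""
    n = len(matrix)
    idx = order if order is not None else list(range(n))
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)  # undefined if either has no variance


def mantel_test(d1, d2, permutations=10_000, seed=0):
    """Return (correlation, one-sided permutation p-value) for two matrices."""
    rng = random.Random(seed)
    x = upper_triangle(d1)
    observed = pearson(x, upper_triangle(d2))
    order = list(range(len(d2)))
    as_extreme = 0
    for _ in range(permutations):
        rng.shuffle(order)  # permute the labels of the second matrix
        if pearson(x, upper_triangle(d2, order)) >= observed:
            as_extreme += 1
    return observed, (as_extreme + 1) / (permutations + 1)


# Toy data: 1 = resistant, 0 = susceptible; one column per phage
host_range_columns = [[0, 0, 1, 0, 1], [0, 0, 1, 0, 1], [1, 0, 0, 0, 0], [0, 1, 0, 0, 1]]
phenotype = distance_matrix(host_range_columns)
phylogeny = [[0.0, 0.1, 0.3, 0.2], [0.1, 0.0, 0.3, 0.25], [0.3, 0.3, 0.0, 0.35], [0.2, 0.25, 0.35, 0.0]]
r, p = mantel_test(phenotype, phylogeny, permutations=1000)
print(f"r = {r:.3f}, p = {p:.3f}")
```

Permuting the row and column labels of one matrix, rather than shuffling individual distances, preserves the dependence among distances that share a phage, which is the reason a Mantel test is used instead of an ordinary correlation test.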

CRISPR Search

CRISPR spacer sequences were identified in P. acnes genomes using CRISPRFinder [35]. The extracted spacer sequences were aligned against all phage sequences using BLASTn. Protospacers with up to two mismatches were identified.
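The matching criterion (up to two mismatches) can be illustrated with a simple ungapped scan of both strands. This sketch substitutes an exhaustive sliding-window comparison for BLASTn, which was the tool actually used, so it illustrates the criterion rather than the study's pipeline. The function names, spacer, and genome string are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def count_mismatches(a: str, b: str) -> int:
    """Number of differing positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_protospacers(spacer: str, genome: str, max_mismatches: int = 2):
    """Return (start, strand, mismatches) for each ungapped match in the genome.

    `start` is a 0-based position on the plus strand of the genome.
    """
    hits = []
    k = len(spacer)
    for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
        for start in range(len(genome) - k + 1):
            mismatches = count_mismatches(query, genome[start:start + k])
            if mismatches <= max_mismatches:
                hits.append((start, strand, mismatches))
    return hits


# Toy spacer (real spacers in the study are 33 nt) and toy genome fragment
toy_spacer = "GATTACAGG"
toy_genome = "TTTGATTACAGGCCCGATCACAGGAAA"
for hit in find_protospacers(toy_spacer, toy_genome):
    print(hit)
```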

Results

The genomes of 15 P. acnes phages isolated from human skin were sequenced. The phage genomes showed moderately high sequence similarity and were comparable in size and organization. Based on a comparison of the genomes, most phages diverge from each other, while some of them form closely-related groups that were not described previously. When tested against a panel of 69 Propionibacterium strains, these phages lysed all P. acnes strains except some strains of types IB-3 and II. Some of the phages were also able to lyse Propionibacterium humerusii strains. It was found that bacterial susceptibility/resistance to phages had no significant correlation with phage phylogeny or with the presence of CRISPR spacers in type II P. acnes strains that match the protospacers in the phage genomes.

CONCLUSIONS

With 15 new phage genomes, it was determined that the diversity of P. acnes phages is broader than previously described, with novel groups added. The host range and specificity differ among the phages but are not correlated with the phylogeny of the phage genomes. It was also found that encoding CRISPR spacers that match phage genomes is not sufficient to confer P. acnes resistance to phages. This study provides new insight into the potential application of phages in treating acne and other P. acnes-associated diseases.

REFERENCES CITED IN EXAMPLE 8

-   1. Mc Grath S, van Sinderen D (eds): Bacteriophage: Genetics and Molecular Biology. Norfolk, UK: Caister Academic Press; 2007.
-   2. Rohwer F: Global phage diversity. Cell 2003, 113:141.
-   3. Rohwer F, Thurber R V: Viruses manipulate the marine environment. Nature 2009, 459:207-212.
-   4. Suttle C A, Chan A M, Cottrell M T: Infection of Phytoplankton by Viruses and Reduction of Primary Productivity. Nature 1990, 347:467-469.
-   5. Rodriguez-Valera F, Martin-Cuadrado A B, Rodriguez-Brito B, Pasic L, Thingstad T F, Rohwer F, Mira A: Explaining microbial population genomics through phage predation. Nature reviews Microbiology 2009, 7:828-836.
-   6. Grice E A, Segre J A: The skin microbiome. Nature reviews Microbiology 2011, 9:244-253.
-   7. precedings.nature.com/documents/5305/version/1
-   8. Bojar R A, Holland K T: Acne and Propionibacterium acnes. Clin Dermatol 2004, 22:375-379.
-   9. Leyden J J: The evolving role of Propionibacterium acnes in acne. Semin Cutan Med Surg 2001, 20:139-143.
-   10. Zierdt C H, Webster C, Rude W S: Study of the anaerobic corynebacteria. International Journal of Systematic Bacteriology 1968, 18:33-47.
-   11. Lood R, Collin M: Characterization and genome sequencing of two Propionibacterium acnes phages displaying pseudolysogeny. BMC genomics 2011, 12:198.
-   12. Lood R, Morgelin M, Holmberg A, Rasmussen M, Collin M: Inducible Siphoviruses in superficial and deep tissue isolates of Propionibacterium acnes. BMC microbiology 2008, 8:139.
-   13. Jong E C, Ko H L, Pulverer G: Studies on bacteriophages of Propionibacterium acnes. Med Microbiol Immunol 1975, 161:263-271.
-   14. Webster G F, Cummins C S: Use of bacteriophage typing to distinguish Propionibacterium acne types I and II. Journal of clinical microbiology 1978, 7:84-90.
-   15. Farrar M D, Howson K M, Bojar R A, West D, Towler J C, Parry J, Pelton K, Holland K T: Genome sequence and analysis of a Propionibacterium acnes bacteriophage. Journal of bacteriology 2007, 189:4161-4167.
-   16. Horvath P, Barrangou R: CRISPR/Cas, the immune system of bacteria and archaea. Science 2010, 327:167-170.
-   17. Brzuszkiewicz E, Weiner J, Wollherr A, Thurmer A, Hupeden J, Lomholt H B, Kilian M, Gottschalk G, Daniel R, Mollenkopf H J, et al: Comparative genomics and transcriptomics of Propionibacterium acnes. PloS one 2011, 6:e21581.
-   18. Butler-Wu S M, Sengupta D J, Kittichotirat W, Matsen F A, 3rd, Bumgarner R E: Genome sequence of a novel species, Propionibacterium humerusii. Journal of bacteriology 2011, 193:3678.
-   19. Mojica F J, Diez-Villasenor C, Garcia-Martinez J, Almendros C: Short motif sequences determine the targets of the prokaryotic CRISPR defence system. Microbiology 2009, 155:733-740.
-   20. Westra E R, van Erp P B, Kunne T, Wong S P, Staals R H, Seegers C L, Bollen S, Jore M M, Semenova E, Severinov K, et al: CRISPR immunity relies on the consecutive binding and degradation of negatively supercoiled invader DNA by Cascade and Cas3. Molecular cell 2012, 46:595-605.
-   21. Semenova E, Jore M M, Datsenko K A, Semenova A, Westra E R, Wanner B, van der Oost J, Brouns S J, Severinov K: Interference by clustered regularly interspaced short palindromic repeat (CRISPR) RNA is governed by a seed sequence. Proceedings of the National Academy of Sciences of the United States of America 2011, 108:10098-10103.
-   22. Semenova E, Nagornykh M, Pyatnitskiy M, Artamonova I I, Severinov K: Analysis of CRISPR system function in plant pathogen Xanthomonas oryzae. FEMS microbiology letters 2009, 296:110-116.
-   23. Srinivasiah S, Bhavsar J, Thapar K, Liles M, Schoenfeld T, Wommack K E: Phages across the biosphere: contrasts of viruses in soil and aquatic environments. Research in microbiology 2008, 159:349-357.
-   24. Chevreux B, Pfisterer T, Drescher B, Driesel A J, Muller W E, Wetter T, Suhai S: Using the miraEST assembler for reliable and automated mRNA transcript assembly and SNP detection in sequenced ESTs. Genome research 2004, 14:1147-1159.
-   25. Gordon D, Abajian C, Green P: Consed: a graphical tool for sequence finishing. Genome research 1998, 8:195-202.
-   26. Lukashin A V, Borodovsky M: GeneMark.hmm: new solutions for gene finding. Nucleic acids research 1998, 26:1107-1115.
-   27. Delcher A L, Harmon D, Kasif S, White O, Salzberg S L: Improved microbial gene identification with GLIMMER. Nucleic acids research 1999, 27:4636-4641.
-   28. Kurtz S, Phillippy A, Delcher A L, Smoot M, Shumway M, Antonescu C, Salzberg S L: Versatile and open software for comparing large genomes. Genome biology 2004, 5:R12.
-   29. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S: MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Molecular biology and evolution 2011, 28:2731-2739.
-   30. Katoh K, Misawa K, Kuma K, Miyata T: MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic acids research 2002, 30:3059-3066.
-   31. Edgar R C: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic acids research 2004, 32:1792-1797.
-   32. Gouy M, Guindon S, Gascuel O: SeaView version 4: A multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Molecular biology and evolution 2010, 27:221-224.
-   33. Rutherford K, Parkhill J, Crook J, Horsnell T, Rice P, Rajandream M A, Barrell B: Artemis: sequence visualization and annotation. Bioinformatics 2000, 16:944-945.
-   34. Dray S, Dufour A B: The ade4 package: Implementing the duality diagram for ecologists. Journal of Statistical Software 2007, 22:1-20.
-   35. Grissa I, Vergnaud G, Pourcel C: CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats. Nucleic acids research 2007, 35:W52-57.
-   (Nature Precedings Paper). (n.d.). Retrieved from precedings.nature.com/documents/5305/version/1
-   Bojar, R. A., & Holland, K. T. (2004). Acne and Propionibacterium acnes. Clinics in dermatology, 22(5), 375-9. doi:10.1016/j.clindermatol.2004.03.005
-   Farrar, M. D., Howson, K. M., Bojar, R. A., West, D., Towler, J. C., Parry, J., Pelton, K., et al. (2007). Genome sequence and analysis of a Propionibacterium acnes bacteriophage. Journal of bacteriology, 189(11), 4161-7. doi:10.1128/JB.00106-07
-   Grice, E. A., & Segre, J. A. (2011). The skin microbiome. Nature reviews. Microbiology, 9(4), 244-53. doi:10.1038/nrmicro2537
-   Leyden, J. J. (2001). The evolving role of Propionibacterium acnes in acne. Seminars in cutaneous medicine and surgery, 20(3), 139-43. doi:10.1053/sder.2001.28207
-   Lood, R., & Collin, M. (2011). Characterization and genome sequencing of two Propionibacterium acnes phages displaying pseudolysogeny. BMC genomics, 12(1), 198. doi:10.1186/1471-2164-12-198
-   Marinelli, L. J., Fitz-Gibbon, S., Hayes, C., Bowman, C., Inkeles, M., Loncaric, A., & Russell, D. A. (2012). Propionibacterium acnes Bacteriophages Display Limited Genetic Diversity and Broad Killing Activity against Bacterial Skin Isolates. mBio, 3(5), 1-13. doi:10.1128/mBio.00279-12
-   Mc Grath, S., & van Sinderen, D. (eds). (2007). Bacteriophage: Genetics and Molecular Biology. Norfolk, UK: Caister Academic Press.
-   Roche Diagnostics. (2009). GS FLX Titanium General Library Preparation Method Manual. Mannheim: Roche Diagnostics GmbH.
-   Rodriguez-Valera, F., Martin-Cuadrado, A.-B., Rodriguez-Brito, B., Pasić, L., Thingstad, T. F., Rohwer, F., & Mira, A. (2009). Explaining microbial population genomics through phage predation. Nature reviews. Microbiology, 7(11), 828-36. doi:10.1038/nrmicro2235
-   Rohwer, F. (2003). Global phage diversity. Cell, 113(2), 141.
-   Zierdt, C. H., Webster, C., & Rude, W. S. (1968). Study of the anaerobic corynebacteria. International Journal of Systematic Bacteriology, 18, 33-47.

All the strains of RT4, RT5, and RT8 show sensitivity to all of the phages shown in Table 5. Therefore, acne patients may be treated with phage by using the phage strains that are listed in Table 5:

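TABLE 5 Host range and specificity of P. acnes phages

| Strain | RecA Type | Ribotype | CRISPR/Cas | PHL113M01 | PHL111M01 | PHL082M00 | PHL060L00 | PHL067M10 | PHL071N05 | PHL112N00 | PHL037Z02 | PHL085N00 | PHL115M02 | PHL085M01 | PHL114L00 | PHL073M02 | PHL010M04 | PHL066M04 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HL036PA1 | IA | 532 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL036PA2 | | 532 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL036PA3 | | 1 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL046PA2 | | 1 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL002PA3 | | 1 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL002PA2 | | 1 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL005PA2 | | 1 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL005PA3 | | 1 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL020PA1 | | 1 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL027PA2 | | 1 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL100PA1 | | 1 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL087PA2 | | 1 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL013PA2 | | 1 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL063PA1 | | 1 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL072PA1 | | 5 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL072PA2 | | 5 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL106PA2 | | 1 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL099PA1 | | 4 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL083PA1 | | 1 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL038PA1 | | 4 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL074PA1 | | 4 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL005PA1 | | 4 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL056PA1 | | 4 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL053PA1 | | 4 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL045PA1 | | 4 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL007PA1 | | 4 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL096PA1 | | 5 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL043PA1 | | 5 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL043PA2 | | 5 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL078PA1 | | 1 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL086PA1 | IB | 8 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL082PA1 | | 8 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL110PA2 | | 8 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL053PA2 | | 8 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL092PA1 | | 8 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL110PA1 | | 8 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL025PA1 | | 1 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL030PA2 | | 3 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL063PA2 | | 3 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL037PA1 | | 3 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL059PA1 | | 16 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL059PA2 | | 16 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL025PA2 | | 3 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL005PA4 | | 3 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL067PA1 | | 3 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL002PA1 | | 3 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL027PA1 | | 3 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL046PA1 | | 3 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL083PA2 | | 3 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL013PA1 | | 3 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL050PA1 | | 3 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL050PA3 | | 3 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL087PA1 | | 3 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL087PA3 | | 3 | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| KPA171202 | | 1 | − | S | >10⁴ | S | >10⁴ | >10⁵ | S | >10⁷ | >10³ | >10⁴ | >10⁴ | >10⁴ | >10⁴ | >10⁴ | >10⁵ | >10⁷ |
| HL030PA1 | | 1 | − | >10⁴ | S | S | S | >10² | S | >10³ | >10³ | >10³ | >10³ | >10² | >10³ | >10⁴ | >10⁵ | >10⁴ |
| HL050PA2 | II | 1 | − | S | S | S | S | >10⁴ | S | S | S | S | S | S | S | S | S | S |
| HL060PA1 | | 2 | + | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL082PA2 | | 2 | + | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL001PA1 | | 2 | + | S | >10⁷ | S | S | >10⁷ | S | S | S | S | S | S | >10⁶ | S | S | S |
| HL106PA1 | | 2 | + | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| ATCC 11828 | | 2 | + | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL110PA3 | | 6 | + | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL110PA4 | | 6 | + | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL042PA3 | | 6 | + | S | >10⁶ | >10⁶ | S | >10⁷ | >10⁵ | S | >10⁷ | >10⁶ | >10⁷ | S | >10⁷ | >10⁷ | >10⁷ | >10⁷ |
| HL037PA2 | — | | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL037PA3 | | P. humerusii | − | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
| HL044PA1 | | P. humerusii | − | S | S | S | >10³ | S | S | >10⁴ | >10³ | S | >10² | S | S | S | S | >10² |
| HL078PG1 | — | P. granulosum | | >10⁷ | >10⁷ | >10⁷ | >10⁷ | >10⁷ | >10⁷ | >10⁷ | >10⁷ | >10⁷ | >10⁷ | >10⁷ | >10⁷ | >10⁷ | >10⁷ | >10⁷ |

S, susceptible; >10ⁿ, fold increase in resistance. A blank RecA Type cell carries the value of the nearest labeled row above it.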

Strains in the IB-3 lineage show resistance against most of the tested phages. Therefore, patients with those strains may not benefit as much from phage therapy. SEQ ID NOs 55-81 include four unique genomic sequences for strains in the IB-3 lineage and for several other strains, such as IB-3-s1 (IB-3 and SK187), IB-3-s2 (IB-3 and HL025PA1), IB-3-s3 (IB-3 and HL201PA1), and IB-3-s4 (IB-3 and HL201PA1). The sequence similarities range from 95% to 100%. Primers targeting these sequences can be used to estimate and predict the effectiveness of phage therapy.

FIG. 26 shows the phylogenetic tree of the 32 phages, including the 18 sequenced phages. Some phage strains are highly similar to each other, such as the ones in Groups I and II. This suggests that the same phages can be found in different individuals and supports the use of a particular phage strain as a common treatment agent for different individuals. SEQ ID NOs 33-50 reflect the genome sequences of the 18 sequenced phages, including the 15 phages shown in Table 5.

Potential Therapeutic Phage for Patients with Microbiome Type I Include:

PHL113M01, PHL111M01, PHL082M00, PHL060L00, PHL067M10, PHL071N05, PHL112N00, PHL037M02, PHL085N00, PHL115M02, PHL085M01, PHL114L00, PHL073M02, PHL010M04, and PHL066M04.

Potential Therapeutic Phage for Patients with Microbiome Type I with IB-3 Strain Include:

PHL082M00 and PHL071N05.

Potential Therapeutic Phage for Patients with Microbiome Type II Include:

PHL113M01, PHL060L00, PHL112N00, and PHL085M01.

Potential Therapeutic Phage for Patients with Microbiome Type III or Dominant RT8 Include:

PHL113M01, PHL111M01, PHL082M00, PHL060L00, PHL067M10, PHL071N05, PHL112N00, PHL037M02, PHL085N00, PHL115M02, PHL085M01, PHL114L00, PHL073M02, PHL010M04, and PHL066M04.

Potential Therapeutic Phage for Patients with Microbiome Type IV Include:

PHL113M01, PHL111M01, PHL082M00, PHL060L00, PHL067M10, PHL071N05, PHL112N00, PHL037M02, PHL085N00, PHL115M02, PHL085M01, PHL114L00, PHL073M02, PHL010M04, and PHL066M04.

Potential Therapeutic Phage for Patients with Microbiome Type V Include:

PHL113M01, PHL111M01, PHL082M00, PHL060L00, PHL067M10, PHL071N05, PHL112N00, PHL037M02, PHL085N00, PHL115M02, PHL085M01, PHL114L00, PHL073M02, PHL010M04, and PHL066M04.
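The shorter lists above for the IB-3 strains and for Microbiome Type II can be read off Table 5 as a set intersection: a phage qualifies for a group only if every strain in that group is marked S for it. The following Python sketch illustrates that reading under stated assumptions. The names `PHAGES`, `RESISTANT`, and `phages_lysing_all` are hypothetical, only the non-S cells of Table 5 for the listed strains are transcribed, and `PHL037M02` is assumed to be the phage printed as PHL037Z02 in the Table 5 header.

```python
# Phage columns of Table 5, in table order (PHL037M02 assumed to be the
# column printed as "PHL037Z02" in the table header).
PHAGES = [
    "PHL113M01", "PHL111M01", "PHL082M00", "PHL060L00", "PHL067M10",
    "PHL071N05", "PHL112N00", "PHL037M02", "PHL085N00", "PHL115M02",
    "PHL085M01", "PHL114L00", "PHL073M02", "PHL010M04", "PHL066M04",
]

# Non-"S" cells transcribed from Table 5: strain -> phages it resists.
# Strains with an empty set are marked S for all 15 phages.
RESISTANT = {
    # IB-3 lineage strains
    "KPA171202": set(PHAGES) - {"PHL113M01", "PHL082M00", "PHL071N05"},
    "HL030PA1": set(PHAGES) - {"PHL111M01", "PHL082M00", "PHL060L00",
                               "PHL071N05"},
    # RecA type II strains
    "HL050PA2": {"PHL067M10"},
    "HL060PA1": set(),
    "HL082PA2": set(),
    "HL001PA1": {"PHL111M01", "PHL067M10", "PHL114L00"},
    "HL106PA1": set(),
    "ATCC 11828": set(),
    "HL110PA3": set(),
    "HL110PA4": set(),
    "HL042PA3": set(PHAGES) - {"PHL113M01", "PHL060L00", "PHL112N00",
                               "PHL085M01"},
}


def phages_lysing_all(strains):
    """Return the phages, in table order, to which every strain is 'S'."""
    blocked = set()
    for strain in strains:
        # A KeyError here means the strain was not transcribed above.
        blocked |= RESISTANT[strain]
    return [phage for phage in PHAGES if phage not in blocked]


ib3_strains = ["KPA171202", "HL030PA1"]
type_ii_strains = ["HL050PA2", "HL060PA1", "HL082PA2", "HL001PA1",
                   "HL106PA1", "ATCC 11828", "HL110PA3", "HL110PA4",
                   "HL042PA3"]

print(phages_lysing_all(ib3_strains))
print(phages_lysing_all(type_ii_strains))
```

Under these assumptions the two calls reproduce the IB-3 list (PHL082M00 and PHL071N05) and the Type II list (PHL113M01, PHL060L00, PHL112N00, and PHL085M01) given above.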

Specific Interactions Between Propionibacterium humerusii and P. acnes Phages

Some of the P. acnes phage strains can lyse a closely related Propionibacterium species, P. humerusii, which has been hypothesized to be associated with infection in prostheses. P. acnes phage strains that can lyse P. humerusii strains can potentially be used as a therapeutic agent for P. humerusii associated diseases.

Potential therapeutic phage for P. humerusii associated diseases include:

PHL113M01, PHL111M01, PHL082M00, PHL067M10, PHL071N05, PHL085N00, PHL085M01, PHL114L00, PHL073M02, and PHL010M04.

ORFs in Phage Genomes That Show Identity of 85% or Less to Their PA6 Homolog

| phage_gene name | nucleotide differences relative to PA6 | identity relative to PA6 (fraction) | PA6 ORF length |
|---|---|---|---|
| PA6_gp10 | | | 372 |
| PHL111M01_gp9 | 56 | 0.849462366 | |
| PHL112N00_gp9 | 60 | 0.838709677 | |
| PHL114L00_gp10 | 60 | 0.838709677 | |
| PHL010M04_gp11 | 56 | 0.849462366 | |
| PHL082M00_gp10 | 60 | 0.838709677 | |
| PA6_gp19 | | | 747 |
| PHL114L00_gp19 | 120 | 0.83935743 | |
| PHL010M04_gp20 | 138 | 0.815261044 | |
| PHL066M04_gp20 | 138 | 0.815261044 | |
| PHL073M02_gp20 | 138 | 0.815261044 | |
| PA6_gp44 | | | 309 |
| PHL060L00_gp47 | 83 | 0.731391586 | |
| PHL010M04_gp43 | 75 | 0.757281553 | |
| PHL066M04_gp43 | 74 | 0.760517799 | |
| PHL073M02_gp43 | 74 | 0.760517799 | |
| PHL067M10_gp43 | 74 | 0.760517799 | |
| PHL082M00_gp43 | 66 | 0.786407767 | |
| PA6_gp33 | | | 357 |
| PHL115M02_gp37 | 54 | 0.848739496 | |
| PHL085M01_gp32 | 54 | 0.848739496 | |
| PHL037M02_gp31 | 54 | 0.848739496 | |
| PHL085N00_gp32 | 54 | 0.848739496 | |
| PA6_gp45 | | | 183 |
| PHL111M01_gp44. | 33 | 0.819672131 | |
| PA6_gp21 | | | 402 |
| PHL112N00_gp20 | 71 | 0.823383085 | |
| PHL010M04_gp22 | 85 | 0.788557214 | |
| PHL066M04_gp22 | 84 | 0.791044776 | |
| PHL073M02_gp22 | 78 | 0.805970149 | |
| PHL071N05_gp21 | 61 | 0.848258706 | |
| PHL067M10_gp22 | 63 | 0.843283582 | |
| PHL115M02_gp26. | 65 | 0.838308458 | |
| PHL085M01_gp22 | 65 | 0.838308458 | |
| PHL037M02_gp21 | 65 | 0.838308458 | |
| PHL085N00_gp22 | 65 | 0.838308458 | |
| PA6_gp40 | | | 228 |
| PHL111M01_gp41 | 37 | 0.837719298 | |
| PHL060L00_gp42 | 37 | 0.837719298 | |
| PHL112N00_gp39 | 54 | 0.763157895 | |
| PHL113M01_gp43 | 38 | 0.833333333 | |
| PHL114L00_gp40 | 40 | 0.824561404 | |
| PHL010M04_gp40 | 50 | 0.780701754 | |
| PHL066M04_gp40 | 50 | 0.780701754 | |
| PHL073M02_gp40 | 50 | 0.780701754 | |
| PHL071N05_gp39 | 39 | 0.828947368 | |
| PHL067M10_gp40 | 37 | 0.837719298 | |
| PHL115M02_gp45 | 43 | 0.811403509 | |
| PHL085M01_gp40 | 43 | 0.811403509 | |
| PHL037M02_gp39 | 43 | 0.811403509 | |
| PHL085N00_gp40 | 43 | 0.811403509 | |
| PHL082M00_gp39 | 42 | 0.815789474 | |
| PA6_gp22_23 | | | 504 |
| PHL114L00_gp22 | 187 | 0.628968254 | |
| PHL010M04_gp23 | 110 | 0.781746032 | |
| PHL066M04_gp23 | 110 | 0.781746032 | |
| PHL073M02_gp23 | 110 | 0.781746032 | |
| PHL067M10_gp23 | 186 | 0.630952381 | |
| PA6_gp29 | | | 567 |
| PHL060L00_gp30 | 88 | 0.844797178 | |
| PHL112N00_gp27 | 105 | 0.814814815 | |
| PHL114L00_gp28 | 98 | 0.827160494 | |
| PA6_gp35 | | | 471 |
| PHL114L00_gp33 | 76 | 0.838641189 | |
| PA6_gp41 | | | 540 |
| PHL111M01_gp42 | 109 | 0.798148148 | |
| PHL060L00_gp43 | 104 | 0.807407407 | |
| PHL112N00_gp40 | 118 | 0.781481481 | |
| PHL113M01_gp44 | 107 | 0.801851852 | |
| PHL114L00_gp41 | 124 | 0.77037037 | |
| PHL071N05_gp40 | 110 | 0.796296296 | |
| PHL067M10_gp41 | 119 | 0.77962963 | |
| PHL115M02_gp46 | 112 | 0.792592593 | |
| PHL085M01_gp41 | 112 | 0.792592593 | |
| PHL037M02_gp40 | 112 | 0.792592593 | |
| PHL085N00_gp41 | 112 | 0.792592593 | |
| PHL082M00_gp40 | 111 | 0.794444444 | |
| PA6_gp47 | | | 180 |
| PHL115M02_gp50 | 35 | 0.805555556 | |
| PHL085M01_gp45 | 35 | 0.805555556 | |
| PHL037M02_gp44 | 35 | 0.805555556 | |
| PHL111M01_gp45 | 43 | 0.761111111 | |
| PHL071N05_gp44.1 | 32 | 0.822222222 | |
| PHL114L00_gp45.1 | 38 | 0.788888889 | |
| PHL113M01_gp47 | 31 | 0.827777778 | |
| PHL085N00_gp45 | 35 | 0.805555556 | |
| PA6_gp24 | | | 393 |
| PHL114L00_gp23 | 75 | 0.809160305 | |
| PHL010M04_gp24 | 66 | 0.832061069 | |
| PHL066M04_gp24 | 66 | 0.832061069 | |
| PHL073M02_gp24 | 66 | 0.832061069 | |
| PHL067M10_gp24 | 61 | 0.844783715 | |
| PHL115M02_gp29 | 65 | 0.834605598 | |
| PHL085M01_gp24 | 65 | 0.834605598 | |
| PHL037M02_gp23 | 65 | 0.834605598 | |
| PHL085N00_gp24 | 65 | 0.834605598 | |
| PA6_gp30 | | | 564 |
| PHL111M01_gp29 | 115 | 0.796099291 | |
| PHL113M01_gp31 | 105 | 0.813829787 | |
| PHL082M00_gp29 | 116 | 0.794326241 | |
| PA6_gp18 | | | 264 |
| PHL112N00_gp17 | 40 | 0.848484848 | |
| PHL082M00_gp18 | 41 | 0.84469697 | |
| PA6_gp37 | | | 948 |
| PHL112N00_gp34 | 168 | 0.82278481 | |
| PHL114L00_gp35 | 163 | 0.828059072 | |
| PA6_gp36 | | | 411 |
| PHL111M01_gp36 | 80 | 0.805352798 | |
| PHL112N00_gp33 | 76 | 0.815085158 | |
| PHL113M01_gp38 | 82 | 0.800486618 | |
| PHL114L00_gp34 | 86 | 0.790754258 | |
| PHL010M04_gp35 | 82 | 0.800486618 | |
| PHL066M04_gp35 | 82 | 0.800486618 | |
| PHL073M02_gp35 | 82 | 0.800486618 | |
| PHL071N05_gp34 | 82 | 0.800486618 | |
| PHL115M02_gp40 | 68 | 0.834549878 | |
| PHL085M01_gp35 | 68 | 0.834549878 | |
| PHL037M02_gp34 | 68 | 0.834549878 | |
| PHL085N00_gp35 | 68 | 0.834549878 | |
| PHL082M00_gp34 | 85 | 0.793187348 | |

Each PA6 row gives the ORF length for the phage genes listed beneath it. The third column, printed in the source under a "percent difference relative to PA6" heading, equals 1 − (nucleotide differences / PA6 ORF length), i.e., identity expressed as a fraction.
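The fractional values in the ORF table are consistent with a simple calculation: one minus the number of nucleotide differences divided by the PA6 ORF length. The Python sketch below recomputes three listed rows and applies the 85% cutoff from the table title. The function names `identity_to_pa6` and `passes_cutoff` are hypothetical, and the formula is inferred from the listed numbers rather than stated in the text.

```python
def identity_to_pa6(differences, pa6_orf_length):
    """Fraction of positions identical to the PA6 homolog (inferred formula)."""
    return 1 - differences / pa6_orf_length


def passes_cutoff(differences, pa6_orf_length, cutoff=0.85):
    """True when identity is at or below the cutoff used in the table title."""
    return identity_to_pa6(differences, pa6_orf_length) <= cutoff


# (phage gene, nucleotide differences, PA6 ORF length, value listed in the table)
listed_rows = [
    ("PHL111M01_gp9", 56, 372, 0.849462366),
    ("PHL114L00_gp22", 187, 504, 0.628968254),
    ("PHL082M00_gp18", 41, 264, 0.84469697),
]

for gene, diffs, length, listed in listed_rows:
    computed = identity_to_pa6(diffs, length)
    # Listed values are rounded to about nine decimal places.
    print(gene, round(computed, 9), abs(computed - listed) < 1e-8,
          passes_cutoff(diffs, length))
```

A gene with, say, 10 differences over a 372-nucleotide ORF (a hypothetical case, not from the table) would have an identity of about 0.973 and would fall outside the table.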

Example 9—Drug Development

Based on the foregoing, it is now known that some P. acnes strains are associated with acne. Therefore, at the time of diagnosis, it will be useful for dermatologists to know which strains are dominant on the skin of the patient. To do this, one first needs to extract bacterial DNA from the skin sample of the patient. The method/kit to isolate bacterial DNA from the skin for downstream analysis, detailed above, can be implemented in practice. After bacterial DNA is extracted, the fast and accurate detection/diagnosis method/kit to identify the microbiome type of the patient, detailed above, can be implemented for diagnosis. Once the microbiome type of the patient is diagnosed, several approaches can be used to treat the patient.

For example, if the patient has microbiome type IV or V, or is dominated by P. acnes RT10 strains, antibiotic treatment is less likely to succeed, because these strains are antibiotic resistant. These patients should be treated using other therapies, such as retinoids or the methods described herein. If the patient has the virulent ribotypes, including RT4, RT5, and RT8, drugs specifically targeting RT4, RT5, and RT8 can be used. For example, small molecules, antisense molecules, siRNA, biologics, antibodies, or combinations thereof targeting the genetic elements and biological pathways unique to the P. acnes strains associated with acne, detailed above, can be used.
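The selection logic of this example can be summarized as a small rule set. The Python sketch below is illustrative only and is not a clinical tool. The class `SkinProfile`, the function `suggest_approaches`, and the wording of the returned strings are hypothetical, and the rules encode only the two conditions stated in this example.

```python
from dataclasses import dataclass

ANTIBIOTIC_RESISTANT_TYPES = {"IV", "V"}      # microbiome types named above
VIRULENT_RIBOTYPES = {"RT4", "RT5", "RT8"}    # virulent ribotypes named above


@dataclass(frozen=True)
class SkinProfile:
    """Diagnosis result for one patient (hypothetical structure)."""
    microbiome_type: str      # "I", "II", "III", "IV", or "V"
    dominant_ribotype: str    # e.g. "RT4", "RT10"


def suggest_approaches(profile):
    """Map a diagnosed profile to the treatment approaches named in the text."""
    suggestions = []
    if (profile.microbiome_type in ANTIBIOTIC_RESISTANT_TYPES
            or profile.dominant_ribotype == "RT10"):
        # Antibiotics are less likely to succeed for these strains.
        suggestions.append("non-antibiotic therapy, such as retinoids")
    if profile.dominant_ribotype in VIRULENT_RIBOTYPES:
        suggestions.append("drugs specifically targeting RT4, RT5, and RT8")
    return suggestions


print(suggest_approaches(SkinProfile("IV", "RT10")))
print(suggest_approaches(SkinProfile("I", "RT4")))
```

A profile that matches neither rule returns an empty list, which here simply means that this example names no specific approach for it.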

Example 10—Additional Therapies

If the dominant P. acnes strains in the patient do not harbor a set of CRISPR/Cas, additional treatment with phage therapy based on the foregoing can be used. For example, bacteriophage-based strain-specific therapy to treat acne can be employed. An alternative treatment strategy is to balance the relative abundance of the P. acnes strains by promoting the growth of health-associated strains. The strains associated with health can be used as probiotics. These can be topical creams, solutions, or other cosmetic products.

For prevention purposes, a vaccine can be developed against virulent strains of P. acnes.

Longitudinal studies can determine whether the microbiome types change over time and whether certain strains persist on subjects after treatment.

Inoculation experiments with virulent and health-associated strains can determine whether the P. acnes strain population changes.

Specific interactions between P. acnes strains and phages may be studied.

Immune responses in human cells against different strains of P. acnes may also be measured.

The following publications are incorporated herein by reference in their entireties for all purposes, as are all other publications referenced herein and the Sequence Listing:

-   E. Grice et al., 324 Science 1190-1192 (2009).

1-46. (canceled)
47. A method of treating a skin condition in a subject in need thereof, the method comprising administering to the subject a composition comprising a P. acnes strain associated with healthy skin.
48. The method of claim 47, wherein the skin condition is acne.
49. The method of claim 47, wherein the composition is formulated in a topical cream or solution.
50. The method of claim 47, wherein the composition is formulated in a cosmetic product.
51. The method of claim 47, wherein the P. acnes strain associated with healthy skin does not have a ribotype comprising RT4, RT5, RT7, RT8, RT9 or RT10.
52. The method of claim 47, wherein the P. acnes strain lacks a plasmid having SEQ ID NO: 31.
53. The method of claim 47, wherein the P. acnes strain lacks a locus having SEQ ID NO: 29.
54. The method of claim 47, wherein the P. acnes strain lacks a locus having SEQ ID NO: 30.
55. The method of claim 47, wherein the P. acnes is characterized as associated with healthy skin based on its 16S rDNA.
56. The method of claim 47, wherein treatment comprises prevention of acne.
57. The method of claim 47, further comprising treating the subject with an active ingredient targeting a P. acnes associated with acne.
58. The method of claim 47, wherein the P. acnes strain comprises CRISPR.
59. A composition comprising a topical cream and a P. acnes strain associated with healthy skin.
60. The composition of claim 59, wherein the P. acnes strain associated with healthy skin does not have a ribotype comprising RT4, RT5, RT7, RT8, RT9 or RT10.
61. The composition of claim 59, wherein the P. acnes strain lacks a plasmid having SEQ ID NO: 31.
62. The composition of claim 59, wherein the P. acnes strain lacks a locus having SEQ ID NO: 29.
63. The composition of claim 59, wherein the P. acnes strain lacks a locus having SEQ ID NO: 30.
64. The composition of claim 59, wherein the P. acnes is characterized as associated with healthy skin based on its 16S rDNA.
65. The composition of claim 59, further comprising an active ingredient targeting a P. acnes associated with acne.
66. The composition of claim 59, wherein the P. acnes comprises CRISPR.